METHODS AND COMPOSITIONS RELATED TO HEIGHTENED APOBEC-1
RELATED PROTEIN (ARP) EXPRESSION 

CROSS-REFERENCE TO RELATED APPLICATIONS 

This application claims benefit of U.S. Provisional Application No. 60/652,177, filed
February 11, 2005, which is hereby incorporated herein by reference in its entirety.

BACKGROUND OF THE INVENTION

Human white blood cells express proteins called APOBEC-1 related proteins (ARPs),
which are cytidine deaminases that can change the genetic code of an infecting virus. These
changes can render the virus incapable of producing an infection when they occur in critical
genes encoding viral proteins and/or when they occur extensively throughout the viral genome.
ARPs such as CEM15, APOBEC-3B, APOBEC-3C, and APOBEC-3F have been found to have a
deleterious effect on HIV-1, HIV-2, other retroviruses, and hepatitis B. HIV-1, however,
expresses a protein called viral infectivity factor (Vif) that impairs
the ability of ARPs such as CEM15 to act on viral DNA.

A small subset of HIV-infected individuals, known as long-term nonprogressors
(LTNPs), have substantially slower rates of disease progression in the absence of therapeutic
intervention. Clinically, these LTNPs are usually asymptomatic and maintain high CD4 counts and
low HIV viremia levels. These characteristics are therefore of prognostic value in evaluating
disease severity. The mechanisms responsible for long-term nonprogression have been
attributed to defective or less fit HIV variants, strong host immune responses and unique host
genetic elements, such as the CCR5 genotype and HLA haplotypes (Buchbinder et al. (1999)
Microbes and Infection 1:1113-1120).

Thus, needed in the art are methods and compositions related to determining the status
and mechanisms underlying long-term nonprogression of viral infections. More specifically, the
role of APOBEC-1 related proteins in viral progression and their effect in long-term
nonprogressors is of importance.

SUMMARY OF THE INVENTION

In accordance with the purposes of this invention, as embodied and broadly described 
herein, this invention, in one aspect, relates to a method of predicting the severity of a viral 
infection in a subject. For example, the level of expression of at least one APOBEC-1 related 
protein is used to indicate the level of severity.

Also disclosed is a method of predicting whether a subject is or will be a long-term
nonprogressor (LTNP) when infected with a virus. A higher level of expression in a biological
sample from the subject of one or more APOBEC-1 related proteins as compared to a control
level indicates the subject is a potential LTNP.

Further disclosed is a method of optimizing antiviral therapy in a subject with a viral
infection. The level of expression of one or more APOBEC-1 related proteins (ARPs) in a
biological sample from the subject is used to adjust the antiviral therapy, thereby optimizing the
antiviral therapy.

Also disclosed is a method of predicting the level of CD4 cells in a subject. The level of 
CEM15 correlates with the level of CD4 cells and can be used to predict the level of CD4 cells.

Also disclosed is a method of monitoring effectiveness of an antiviral agent in a subject. 
Specifically, expression levels of one or more APOBEC-1 related proteins are monitored during 
the treatment. An increase in expression levels of the APOBEC-1 related proteins during the 
course of treatment indicates the effectiveness of the antiviral agent. 

Further disclosed are a method of screening for an antiviral agent and compositions used
to detect levels of ARP expression, including nucleic acid primers and probes.

Additional advantages of the invention will be set forth in part in the description which 
follows, and in part will be obvious from the description, or may be learned by practice of the 
invention. The advantages of the invention will be realized and attained by means of the 
elements and combinations particularly pointed out in the appended claims. It is to be
understood that both the foregoing general description and the following detailed description are
exemplary and explanatory only and are not restrictive of the invention, as claimed. 

BRIEF DESCRIPTION OF THE DRAWINGS 

The accompanying drawings, which are incorporated in and constitute a part of this
specification, illustrate several embodiments of the invention and, together with the description,
serve to explain the principles of the invention. 

Figure 1 shows representative members of the APOBEC-1 related family of cytidine 
deaminases including CEM15. Also shown are APOBEC-1 complementation factor (ACF) and 
viral infectivity factor (Vif). The catalytic domain of APOBEC-1 is characterized by a
zinc-dependent deaminase domain (ZDD) with
three zinc ligands (either His or Cys), a glutamic acid, a proline residue and a conserved primary
sequence spacing (Mian, I.S., et al., (1998) J Comput Biol. 5:57-72). The ZDD of other
deaminases and APOBEC-1 related proteins is shown for comparison along with a consensus
ZDD. The indicated residues in the catalytic site of APOBEC-1 bind AU-rich RNA with weak
affinity. The leucine-rich region (LRR) of APOBEC-1 has been implicated in APOBEC-1
dimerization and shown to be required for editing (Lau, P.P., et al., (1994) Proc Natl Acad Sci U
S A. 91:8522-6; Oka, K., et al., (1997) J Biol Chem. 272:1456-60), but structural modeling
suggests that the LRR forms the hydrophobic core of the protein monomer (Navaratnam, N., et al.,
(1998) J Mol Biol. 275:695-714). ACF complements APOBEC-1 through its APOBEC-1 and
RNA binding activities. The RNA recognition motifs (RRMs) are required for mooring
sequence-specific RNA binding, and these domains plus sequence flanking them are required for
APOBEC-1 interaction and complementation (Blanc, V., et al., (2001) J Biol Chem. 276:46386-93;
Mehta, A., et al., (2002) RNA. 8:69-82). APOBEC-1 complementation activity minimally
depends on ACF binding to both APOBEC-1 and mooring sequence RNA. A broad APOBEC-1
complementation region is indicated that is inclusive of all regions implicated in this activity
(Blanc, V., et al., (2001) J Biol Chem. 276:46386-93; Mehta, A., et al., (2002) RNA. 8:69-82).
Experiments have shown the N-terminal half of Vif is necessary for viral infectivity (Henzler, T.
2001). However, reports have demonstrated that residues in the C-terminus (amino acids 151-164)
are essential for infectivity (Yang, S. et al. 2001) and that multimerization of Vif through
the motif PPLP (SEQ ID NO: 14) within this region was essential for infectivity. Peptides
capable of binding to this domain of Vif blocked Vif-Vif interactions and Vif-Hck interactions in
vitro and suppressed viral infectivity in cell-based assay systems. Residues in the N-terminus of
Vif are essential for RNA binding and packaging of Vif within the virion (Zhang et al. 2000; Khan
et al. 2001; Lake et al. 2003).

Figure 2 shows schematic depictions of the cytidine deaminase (CDA) polypeptide fold 
and structure-based alignments of APOBEC-1 with respect to its related proteins (ARPs). 
Figure 2a depicts a gene duplication model for cytidine deaminases. CDD1 belongs to the 
tetrameric class of cytidine deaminases with a quaternary fold nearly identical to that of the
tetrameric cytidine deaminase from B. subtilis (Johansson, E., et al., (2002) Biochemistry.
41:2563-70). Such tetrameric enzymes exhibit the classical αββαβαββ topology of the Zinc
Dependent Deaminase Domain (ZDD) observed first in the Catalytic Domain (CD) of the
dimeric enzyme from E. coli (Betts, L., et al., (1994) J Mol Biol. 235:635-56). According to the
gene duplication model, an ancestral CDD1-like monomer (upper left ribbon) duplicated and
fused to produce a bipartite monomer. Over time a C-terminal Pseudo-Catalytic Domain (PCD)
arose that lost substrate and Zn2+ binding abilities (upper right ribbon). The model holds that the
interdomain CD-PCD junction is joined via a flexible linker that features conserved Gly residues
necessary for catalytic activity on large polymeric DNA or RNA substrates. The function of the
PCD is to stabilize the hydrophobic monomer core and to engage in auxiliary factor binding. The
loss of PCD helix α1 can provide a hydrophobic surface where auxiliary factors bind to facilitate
substrate recognition, thereby regulating catalysis. The enzymes remain oligomeric because each
active site comprises multiple polypeptide chains. Modern representatives of the chimeric CDA
fold include the enzyme from E. coli, as well as APOBEC-1 and AID. Other ARPs such as
APOBEC-3G (CEM15) may have arisen through a second gene duplication to produce a
pseudo-homodimer on a single polypeptide chain (lower ribbon); structural properties of the
connector polypeptide are unknown. Signature sequences compiled from strict structure-based
alignments (upper) are shown below respective ribbon diagrams, where X represents any amino
acid. Linker regions (lines) and the location of Zn2+ binding (spheres) are depicted. Although
experimental evidence suggests APOBEC-3B has reduced Zn2+ binding and exists as a dimer
(Jarmuz, A., et al., (2002) Genomics, 79:285-96), modeling studies suggest it will bind Zn2+ (as
shown in Wedekind et al. Trends Genet, 19(4):207-16, 2003) and may function as a monomer.
Inset spheres represent proper (222) CDD1-like quaternary structure symmetry whereas
APOBEC-1-like enzymes exhibit pseudo-222 symmetry relating CD and PCD subunits; in the
latter enzyme a proper dyad axis relates the polypeptide chains. Finally, APOBEC-3G can fold
as a monomer from a single polypeptide chain with each CD and PCD (differently colored
spheres in lower left inset box) related by improper 222 symmetry with no strict axes of
symmetry. Figure 2b depicts the structure-based sequence alignment for ARPs. Sequences from
human APOBEC-1, AID, and APOBEC-3G were aligned based upon a main-chain alpha-carbon
least-squares superposition of the known cytidine deaminase three-dimensional crystal structures
from E. coli, B. subtilis and S. cerevisiae (Figure 2c). Amino acid sequence alignments were
optimized to minimize gaps in major secondary structure elements, which are depicted as tubes
(α-helices) and arrows (β-strands) in Figure 2b. Additionally, loops, turns, and insertions of
Figure 2b are marked L, T, and i, respectively. L-C1 and L-C2 represent distinct loop
structures in the dimeric versus tetrameric cytidine deaminases. Sections of basic residues that
overlap the bipartite NLS of APOBEC-1 are marked BP-1 and BP-2. Figure 2d depicts a
schematic diagram of the domain structure observed in APOBEC-1 and related ARPs based
upon computer-based sequence alignments using the ZDD signature sequence shown in the
lower panel of Figure 2a.

Figure 3 shows the relation of the CEM15 amino acid sequence to APOBEC-1 and other
APOBEC-1 Related Proteins (ARPs) by use of standard computational methods based upon
amino acid similarity or identity. Amino acid sequence alignments illustrate conservation of
Zn2+ ligands and key catalytic residues essential to the mechanism of hydrolytic deamination by
cytidine deaminases (CDA). Collectively, these amino acids form a signature zinc-dependent
deaminase domain (ZDD), present in: (i) APOBEC-1, which mediates C to U editing of apoB 
mRNA, (ii) the Activation Induced Deaminase (AID), which mediates Somatic Hypermutation 
(SHM) and Class Switch Recombination (CSR), and (iii) CEM15, which blocks HIV-1 viral 
infectivity. 

Figure 4 shows reduced production of pseudotyped HIV-1 viral particles by cells
expressing CEM15 or DM. p24 concentration (pg/ml) normalized to % GFP-containing cells (as
a measure of transfection efficiency) for 293T cells stably expressing pIRES-P vector (n=6),
CEM15 (n=6) and DM (n=5), following transfection with wild-type (Vif+) or ΔVif proviral
DNA plasmids (black and white bars, respectively). Error bars represent standard deviation
calculated from n for each cell line.

Figure 5 shows CEM15 suppresses HIV-1 protein abundance. 293T cell lines stably 
expressing (A) CEM15, (B) DM, and (C) control pIRES-P vector were transiently 
transfected with proviral HIV-1 plasmids (containing either wild-type Vif (+) or ΔVif (-)).
Total cell lysates were prepared at 24, 48, and 72 hours post-transfection, separated by SDS-PAGE
and analyzed by immunoblot assay using antibodies reactive with HA (HA-tagged
CEM15 and DM), Vif, p24, RT, β-actin, Vpr, or Tat (as denoted on the left). The molecular
weight (kDa) of the indicated protein species is given to the right. 

Figure 6 shows CEM15 suppresses HIV-1 viral RNA abundance. (A) Location of the Gag-Pol
junction and protease region of HIV-1 genomic RNA corresponding to the GP-RNA
probe used for RNA binding and Northern blot analysis. (B) UV crosslinking of
increasing concentrations of recombinant CEM15 protein (1, 2 and 4 µg protein) to 20 fmol
radiolabeled GP-RNA and apoB RNA. (C) Poly A+ RNA abundance for Gag-Pol transcripts in
293T-CEM15 cells at 24, 48, and 72 hours and DM cells at 48 hours post-transfection with
Vif+ (black) and ΔVif (white) proviral DNA. Results are expressed as the ratio of viral RNA
(GP-RNA region) to endogenous cellular RNA (adenovirus E1A) determined through
phosphorimager scanning densitometry analysis of Northern blots. 

Figure 7 shows real-time PCR assay for CEM15 gene expression. Samples of polyA+ 
mRNA were amplified from a positive control (CEM15 plasmid), patient samples (patient #7 is
shown) and an internal control of GAPDH (inset) from reverse transcribed cDNA using real-time
PCR as described in Example 1. Results showed linear amplification of CEM15 and
GAPDH mRNAs from human PBMC. Using this assay, CEM15 and GAPDH mRNAs were 
quantified in each patient sample (Table 3). 

Figure 8 shows the protective effects of increased CEM15 gene expression. Single linear 
regression analysis between GAPDH normalized CEM15 mRNA levels and HIV viremia or
CD4 counts in eight HIV-infected individuals. Results showed a strong inverse correlation
between CEM15 gene expression and viremia levels (a), and a significant positive correlation 
between CEM15 gene expression and CD4 counts (b). 
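
The single linear regression described for Figure 8 can be sketched as follows. This is only an illustrative sketch: the function names `normalize` and `linear_fit` are hypothetical, and the numeric values are placeholders chosen for the example, not the patient data of Table 3.

```python
from math import sqrt


def normalize(target, reference):
    """Divide each target mRNA level by the matching reference (e.g., GAPDH) level."""
    return [t / r for t, r in zip(target, reference)]


def linear_fit(x, y):
    """Return (slope, intercept, r) for a least-squares regression of y on x."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must each vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / sqrt(sxx * syy)  # Pearson correlation coefficient
    return slope, intercept, r


# Placeholder values for illustration only (NOT data from the specification).
cem15 = [2.0, 4.0, 6.0, 8.0]      # hypothetical CEM15 mRNA levels
gapdh = [2.0, 2.0, 2.0, 2.0]      # hypothetical GAPDH mRNA levels
viremia = [9.0, 7.0, 5.0, 3.0]    # hypothetical viremia values

ratios = normalize(cem15, gapdh)
slope, intercept, r = linear_fit(ratios, viremia)
print(slope, intercept, r)
```

A negative `r` corresponds to an inverse correlation of the kind reported for viremia in panel (a), and a positive `r` to a positive correlation of the kind reported for CD4 counts in panel (b).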

DETAILED DESCRIPTION

The APOBEC-1 and APOBEC-1 related compositions described herein are useful in 
preventing or treating viral infections. Described herein are methods of identifying long term 
nonprogressors, optimizing antiviral infectivity therapy, predicting the severity of a viral 
infection in a subject, predicting the level of CD4 cells in a subject, monitoring the effectiveness
of an antiviral agent, and screening for an antiviral agent. Also disclosed are nucleic acid
sequences used to detect expression of ARPs. 

The present invention may be understood more readily by reference to the following 
detailed description of preferred embodiments of the invention and the Examples included 
therein and to the Figures and their previous and following description. 

Before the present compounds, compositions, articles, devices, and/or methods are
disclosed and described, it is to be understood that this invention is not limited to specific
synthetic methods, specific recombinant biotechnology methods unless otherwise specified, or to
particular reagents unless otherwise specified, as such may, of course, vary. It is also to be
understood that the terminology used herein is for the purpose of describing particular
embodiments only and is not intended to be limiting.

As used in the specification and the appended claims, the singular forms "a," "an" and 
"the" include plural referents unless the context clearly dictates otherwise. Thus, for example, 
reference to "a pharmaceutical carrier" includes mixtures of two or more such carriers, and the 
like. 

Ranges may be expressed herein as from "about" one particular value, and/or to "about"
another particular value. When such a range is expressed, another embodiment includes from
the one particular value and/or to the other particular value. Similarly, when values are
expressed as approximations, by use of the antecedent "about," it will be understood that the
particular value forms another embodiment. It will be further understood that the endpoints of
each of the ranges are significant both in relation to the other endpoint, and independently of the
other endpoint. It is also understood that there are a number of values disclosed herein, and that
each value is also herein disclosed as "about" that particular value in addition to the value itself.
For example, if the value "10" is disclosed, then "about 10" is also disclosed. It is also
understood that when a value is disclosed, "less than or equal to" the value, "greater than or
equal to" the value, and possible ranges between values are also disclosed, as appropriately
understood by the skilled artisan. For example, if the value "10" is disclosed, then "less than or
equal to 10" as well as "greater than or equal to 10" is also disclosed.

In this specification and in the claims which follow, reference will be made to a number 
of terms which shall be defined to have the following meanings:

"Optional" or "optionally" means that the subsequently described event or circumstance 
may or may not occur, and that the description includes instances where said event or 
circumstance occurs and instances where it does not. 

The terms "higher," "increases," "elevates," "enhances," or "elevation" refer to increases 
as compared to a control. The terms "low," "lower," "reduces," "suppresses" or "reduction"
refer to decreases as compared to a control level. Control levels can be normal in vivo levels 
prior to, or in the absence of, an infection or a treatment. Thus, the control can be from the same 
subject prior to infection or treatment or can be an uninfected or untreated control subject or 
group thereof. 

The term "test compound" is defined as any compound to be tested for its ability to bind
to increase ARP activity, production, or expression. "Test compounds" include drugs,
molecules, and compounds that come from combinatorial libraries where thousands of such 
ligands are screened by drug class. 

By "subject" is meant an individual. Preferably, the subject is a mammal such as a
primate, and, more preferably, a human. The term "subject" can include domesticated animals,
such as cats, dogs, etc., livestock (e.g., cattle, horses, pigs, sheep, goats, etc.), and laboratory 
animals (e.g., mouse, rabbit, rat, guinea pig, etc.). 

The terms "control levels" or "control cells" are defined as the standard by which a 
change is measured, for example, the controls are not subjected to the experiment, but are
instead subjected to a defined set of parameters, or the controls are based on pre- or
post-treatment levels.

By "contacting" is meant an instance of exposure of at least one substance to another 
substance. For example, contacting can include contacting a substance, such as a cell, with
a test compound described herein. A cell can be contacted with the test compound, for example,
by adding the protein or small molecule to the culture medium (by continuous infusion, by bolus
delivery, or by changing the medium to a medium that contains the agent) or by adding the agent
to the extracellular fluid in vivo (by local delivery, systemic delivery, intravenous injection,
bolus delivery, or continuous infusion). The duration of contact with a cell or group of cells is
determined by the time the test compound is present at physiologically effective levels or at
presumed physiologically effective levels in the medium or extracellular fluid bathing the cell.
In the present invention, for example, a virally infected cell (e.g., an HIV-infected cell) or a cell
at risk for viral infection (e.g., before, at about the same time, or shortly after HIV infection of 
the cell) is contacted with a test compound. 

"Treatment" or "treating" means to administer a composition to a subject or a system
with an undesired condition or at risk for the condition. The condition can be any pathogenic
disease, autoimmune disease, cancer or inflammatory condition. The administration
of the composition to the subject can have the effect of, but is not limited to, reducing the
symptoms of the condition, a reduction in the severity of the condition, or the complete ablation
of the condition.

By "effective amount" is meant a therapeutic amount needed to achieve the desired result 
or results, e.g., reducing viral infectivity, blunting physiological functions, altering the 
qualitative or quantitative nature of the proteins expressed by cells or tissues, and eliminating or
reducing disease causing molecules and/or the mRNA or DNA that encodes them, etc. 

Herein, "inhibition" or "suppression" means to reduce activity as compared to a control
(e.g., activity in the absence of such inhibition). It is understood that inhibition or suppression
can mean a slight reduction in activity to the complete ablation of all activity. An "inhibitor" or 
"suppressor" can be anything that reduces the targeted activity. 

"Suppression of viral activity" is defined as a 10%, 20%, 30%, 40%, 50%, 60%, 70%, 

80%, 90%, 100%, 2-fold, 10-fold, 100-fold, or 1000-fold suppression of viral activity. Viral
activity includes, but is not limited to, viral reproduction, viral shedding, or viral infectivity. 
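
As an illustration of how a measured viral activity might be compared against a control to obtain the percent or fold values listed above, consider the following minimal sketch. The function names and the choice of formulas (percent reduction relative to control, and the control-to-treated ratio) are assumptions made for this example; the specification does not prescribe a particular calculation.

```python
def percent_suppression(control: float, treated: float) -> float:
    """Percent reduction of activity relative to the control (assumed formula)."""
    if control <= 0:
        raise ValueError("control activity must be positive")
    return 100.0 * (control - treated) / control


def fold_suppression(control: float, treated: float) -> float:
    """Ratio of control activity to treated activity (assumed formula)."""
    if treated <= 0:
        raise ValueError("treated activity must be positive to compute a fold change")
    return control / treated


# Hypothetical activity readings in arbitrary units (not data from the specification).
control_activity = 200.0
treated_activity = 50.0
print(percent_suppression(control_activity, treated_activity))
print(fold_suppression(control_activity, treated_activity))
```

Under these assumed formulas the two scales describe the same measurement differently, so a fold value is undefined when the treated activity is zero even though the percent value is 100%.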

Many methods disclosed herein refer to "systems." It is understood that systems can be, 
for example, cells, columns, or batch processing containers (e.g., culture plates). A system is a 
set of components, any set of components that allows for the steps of the method to be performed.
Typically a system will comprise one or more components, such as a protein(s) or reagent(s).
One type of system disclosed would be a cell that comprises both Vif and a test compound, for 
example. Another type of system would be one that comprises a cell and an infective unit (e.g., 
an HIV unit). A third type of system might be a chromatography column that has CEM15 or 
other ARPs bound to the column. 

By "virally infected mammalian cell system" or "virally infected" is meant an in vitro or
in vivo system infected by a virus. Such a system can include mammalian cellular components;
mammalian cells, tissues, or organs; and whole animal systems. By "HIV infectivity" or "viral
infectivity" is meant the capacity of an in vitro or in vivo system to become infected by a virus
(e.g., an HIV virus).

By "Vif antagonist" is meant any molecule or composition that counteracts, reduces,
suppresses, inhibits, blocks, or hinders the activity of a Vif molecule or a fragment thereof. This
includes Vif dimerization antagonists, which reduce, suppress, inhibit, block, or hinder the
dimerization of Vif. Any time a "Vif antagonist" is mentioned, this includes Vif dimerization
antagonists. Also included are agents that block Vif binding to CEM15, agents that block
Vif-mediated polyubiquitination of CEM15, and the like.

By "cytidine deaminase activator" is meant any molecule or composition that enhances 
or increases the activity of a cytidine deaminase molecule or a fragment thereof. By cytidine
deaminase activator is also meant deoxycytidine deaminase activator, ARP activator, or any
related molecule.

By "deoxycytidine deaminase activator" is meant any molecule or composition that 
enhances or increases the activity of a deoxycytidine deaminase molecule or a fragment thereof. 

By "ARP activator" is meant any molecule or composition that enhances or increases the 
activity of an APOBEC-1 Related Protein molecule or a fragment thereof. 

A "cytidine deaminase-positive cell" means any cell that expresses one or more cytidine
deaminases or deoxycytidine deaminases. Such expression can be naturally occurring or the cell
can include an exogenous nucleic acid that encodes one or more selected deaminases.

"Primers" are a subset of probes that are capable of supporting some type of enzymatic 
manipulation and that can hybridize with a target nucleic acid such that the enzymatic 
manipulation can occur. A primer can be made from any combination of nucleotides or
nucleotide derivatives or analogs available in the art which do not interfere with the enzymatic
manipulation. 

There are several examples of cellular and viral mRNA editing in mammalian cells
(Grosjean and Benne (1998); Smith et al. (1997) RNA 3:1105-23). Two examples of such
editing mechanisms are the adenosine to inosine and cytidine to uridine conversions (Grosjean
and Benne (1998); Smith et al. (1996) Trends in Genetics 12:418-24; Krogh et al. (1994) J.
Mol. Biol. 235:1501-31). Editing can also occur on both RNA and on DNA, and typically these
functions are performed by different types of deaminases.



WO 2007/126402 PCT/US2006/004920 
A to I editing involves a family of adenosine deaminases active on RNA (ADARs). 
ADARs typically have two or more double stranded RNA binding motifs (DRBM) in addition to 
a catalytic domain whose tertiary structure positions a histidine and two cysteines for zinc ion 
coordination and a glutamic acid residue as a proton donor. The catalytic domain is conserved at 
the level of secondary and tertiary structure among ADARs, cytidine nucleoside/nucleotide
deaminases and CDARs but differs markedly from that found in adenosine
nucleoside/nucleotide deaminases (Higuchi et al. (1993) Cell 75:1361-70). ADAR editing sites
are found predominantly in exons and are characterized by RNA secondary structure
encompassing the adenosine(s) to be edited. In human exon A to I editing, RNA secondary
structure is formed between the exon and a 3' proximal sequence with the downstream intron
(Grosjean and Benne (1998); Smith et al. (1997) RNA 3:1105-23; Smith et al. (1996) Trends in
Genetics 12:418-24; Maas et al. (1996) J. Biol. Chem. 271:12221-26; Reuter et al. (1999) Nature
399:75-80; O'Connell (1997) Current Biol. 7:R437-38). Consequently, A to I editing occurs
prior to pre-mRNA splicing in the nucleus. The resultant inosine base pairs with cytosine, and
codons that have been edited effectively have an A to G change. ADAR mRNA substrates
frequently contain multiple A to I editing sites and each site is selectively edited by an ADAR,
such as ADAR1 or ADAR2. ADARs typically function autonomously in editing mRNAs.
ADARs bind secondary structure at the editing site through their double stranded RNA binding
motifs or DRBMs and perform hydrolytic deamination of adenosine through their catalytic
domain.
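
The statement that an edited codon effectively carries an A to G change can be illustrated with a short sketch. The function name `read_a_to_i_edited_codon` and the example codon are hypothetical and are used only to show the substitution; they do not describe a specific editing site from this specification.

```python
def read_a_to_i_edited_codon(codon: str, position: int) -> str:
    """Return the codon as effectively read after A to I editing at `position`.

    Inosine base pairs with cytosine, so the edited adenosine is treated
    here as a guanosine (G).
    """
    codon = codon.upper()
    if len(codon) != 3 or not 0 <= position < 3:
        raise ValueError("expected a 3-nucleotide codon and a position of 0, 1, or 2")
    if codon[position] != "A":
        raise ValueError("only an adenosine can undergo A to I editing")
    return codon[:position] + "G" + codon[position + 1:]


# Hypothetical example: editing the middle adenosine of CAG (glutamine)
# yields a codon read as CGG (arginine) in the standard genetic code.
print(read_a_to_i_edited_codon("CAG", 1))
```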

One example of a Cytosine Deaminase Active on RNA (CDAR) is APOBEC-1
(apolipoprotein B mRNA editing catalytic subunit 1) (accession # NM_005889) encoded on
human chromosome 12 (Grosjean and Benne (1998); Lau et al. (1994) PNAS 91:8522-26;
Teng et al. (1993) Science 260:1816-19). APOBEC-1 edits apoB mRNA primarily at nucleotide
6666 (C6666) and to a lesser extent at C8702 (Powell et al. (1987) Cell 50:831-40; Chen et al.
(1987) Science 238:363-366; Smith (1993) Seminars in Cell Biology 4:267-78) in a zinc-dependent
fashion (Smith et al. (1997) RNA 3:1105-1123). This editing creates an in-frame
translation stop codon, UAA, from a glutamine codon, CAA, at position C6666 (Grosjean and
Benne (1998); Powell et al. (1987) Cell 50:831-840; Chen et al. (1987) Science 238:363-66).
The biomedical significance of apoB mRNA editing is that it results in increased production and
secretion of B48-containing very low density lipoproteins and, correspondingly, a decrease in the
abundance of the atherogenic apoB100-containing low density lipoproteins in serum (Davidson
et al. (1988) JBC 262:13482-85; Baum et al. (1990) JBC 265:19263-70; Wu et al. (1990) JBC
265:12312-12316; Harris and Smith (1992) Biochem. Biophys. Res. Commun. 183:899-903;
Inui et al. (1994) J. Lipid Res. 35:1477-89; Funahashi et al. (1995) J. Lipid Res. 36:414-428;
Giannoni et al. J. Lipid Res. 36:1664-75; Lau et al. (1995) J. Lipid Res. 36:2069-78; Phung et
al. (1996) Metabolism 45:1056-58; Van Mater et al. (1998) Biochem. Biophys. Res. Commun.
252:334-39; von Wronski et al. (1998) Metab. Clin. Exp. 7:869-73).
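
The conversion of the glutamine codon CAA into the stop codon UAA by C to U editing can be shown with a minimal sketch. The helper names below are hypothetical; the set of stop codons is that of the standard genetic code.

```python
# Stop codons of the standard genetic code (RNA alphabet).
STOP_CODONS = {"UAA", "UAG", "UGA"}


def deaminate_c_to_u(codon: str, position: int) -> str:
    """Return the codon after hydrolytic deamination of the cytidine at `position`."""
    codon = codon.upper()
    if len(codon) != 3 or not 0 <= position < 3:
        raise ValueError("expected a 3-nucleotide codon and a position of 0, 1, or 2")
    if codon[position] != "C":
        raise ValueError("only a cytidine can undergo C to U editing")
    return codon[:position] + "U" + codon[position + 1:]


def creates_stop(codon: str, position: int) -> bool:
    """True if C to U editing at `position` turns a sense codon into a stop codon."""
    edited = deaminate_c_to_u(codon, position)
    return codon.upper() not in STOP_CODONS and edited in STOP_CODONS


# Editing the first position of CAA, as described for apoB mRNA at C6666.
print(deaminate_c_to_u("CAA", 0), creates_stop("CAA", 0))
```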

Activation induced deaminase, AID (GenBank accession # BC006296), is encoded on
human chromosome 12 (Muto, 2000; Muramatsu et al. (1999) JBC 274:18740-76; Muramatsu
et al. (2000) Cell 102:553-64; Revy et al. (2000) Cell 102:565-76). AID contains a ZDD
(zinc-dependent deaminase domain) and has 34% amino acid identity to APOBEC-1 (Table 3, Figures
5 and 6). Its location on human chromosome 12p13 suggests it may be related to APOBEC-1 by
a gene duplication event (Lau, 1994; Muto, 2000). This chromosomal region has been
implicated in the autosomal recessive form of Hyper-IgM syndrome (HIGM2) (Revy, 2000).
Most patients with this disorder have homozygous point mutations or deletions in three of the
five coding exons, leading to missense or nonsense mutations (Revy, P., (2000) Cell. 102:565-75).
Significantly, some patients had missense mutations for key amino acids within AID's ZDD
(Revy, 2000; Minegishi, 2000). AID homologous knockout mice demonstrated that AID
expression was the rate limiting step for class switch recombination (CSR) and required for an
appropriate level of somatic hypermutation (SHM) (Muramatsu, 2000). The expression of AID
controls antibody diversity through multiple gene rearrangements involving mutation of DNA
sequence and recombination.

Human APOBEC-2 (GenBank Accession # XM004087) is encoded on chromosome 6
and is expressed uniquely in cardiac and skeletal muscle (Liao et al. Biochem Biophys. Res.
Commun. 260:398-404). It shares homology with APOBEC-1's catalytic domain, has a
leucine/isoleucine-rich C-terminus and a tandem structural homology of the ZBD in its
C-terminus. APOBEC-2 deaminated free nucleotides in vitro but did not have editing activity on
apoB mRNA.

Human phorbolin 1, phorbolin 1-related protein, phorbolin-2 and -3 share characteristics
with C to U editing enzymes. Several proteins with homology to APOBEC-1 named Phorbolins
1, 2, 3, and Phorbolin-1 related protein were identified in skin from patients suffering from
psoriasis and were shown to be induced (in the case of Phorbolins 1 and 2) in skin treated with
phorbol 12-myristate-1-acetate (Muramatsu, M. et al. (1999) J Biol Chem. 274:18470-6). The
genes for these proteins were subsequently renamed as members of the APOBEC-3 or ARP
family locus (Table 1) (Madsen, P. et al. (1999) J Invest Dermatol. 113:162-9). Bioinformatic
studies revealed the presence of two additional APOBEC-1 related proteins in the human
genome. One is an expressed gene (XM_092919) located just 2 kb away from APOBEC-3G,
and is thus likely to be an eighth member of the family. The other is at position 12q23, and has
similarity to APOBEC-3G.

ARP variants show homology to cytidine deaminases (Figure 2d). As anticipated from 
the structure-based sequence alignment (SBSA), some of these proteins bind zinc and have RNA
binding capacities similar to
APOBEC-1 (Jarmuz, A., et al., (2002) Genomics, 79:285-96). However, analysis of APOBEC-3A,
-3B and -3G revealed them unable to edit apoB mRNA (Jarmuz, A., et al., (2002) Genomics,
79:285-96; Muramatsu, M. et al. (1999) J Biol Chem. 274:18470-6). It has been shown that the
frequency of deleterious mutations in HIV and impaired infectivity correlated with the
expression of CEM15 (APOBEC-3G) (Sheehy et al., 2002; Mariani et al., 2003; Mangeat et al.,
2003; Harris et al., 2003; Lecossier et al., 2003).
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HIV expressing functional Vif (viral infectivity factor) protein is able to overcome the 
effects of CEM15 due to the ability of Vif to bind and target fit or ubiquitinate and distruct in the 
proteasome (Mariani et aL, Cell 114:21-31, 2003; Stopal et al. Mol. Cell 12:591-601, 2003; Yu 
5 et al. Nat Struct Mol. Biol 1 1 :435-42, 2004). In contrast, it is unlikely that APOBEC-3D and 3E 
function as an APOBEC-1 like editases because they are missing fundamental sequence 
elements that are required for mRNA editing by both APOBEC-1 and CDD1 (Anant, S. et al. 
(2001) Am J Physiol Cell Physiol. 281:C1904-16; Dance et al 2001), and experimental evidence 
shows an impaired ability to coordinate Zn 2+ and deaminate cytidine Jarmuz, A., et aL, (2002) 

10 Genomics, 79:285-96) APOBEC-3E appears to be a pseudogene (Jarmuz, A., et al 9 (2002) 
Genomics, 79:285-96), yet the EST database shows that APOBEC-3D and APOBEC-3E are 
alternatively spliced to form a single CD-PCD-CD-PCD encoding transcript. 

Additionally, it has been shown that rat APOBEC-1, mouse APOBEC-3, and human
APOBEC-3B are able to inhibit HIV infectivity even in the presence of Vif. Like APOBEC-3G,
human APOBEC-3F preferentially restricts vif-deficient virus. The mutation spectra and
expression profile of APOBEC-3F indicate that this enzyme, together with APOBEC-3G,
accounts for the G to A hypermutation of proviruses described in HIV-infected individuals
(Bishop et al., Curr. Bio. 14:1392-1396, 2004). In accordance with this, it has also been shown
that APOBEC-3F blocks HIV-1 and is suppressed by both the HIV-1 and HIV-2 Vif proteins
(Zheng et al., J Virol 78(11):6073-6076, 2004; Wiegand et al., EMBO 23:2451-58, 2004). The
limited tissue expression, the association with pre-cancerous and cancerous cells (Table 1), and,
in the case of APOBEC-3G, antagonism of the HIV viral protein Vif indicate specific roles for the
APOBEC-3 family in growth/cell cycle regulation and antiviral control.

CEM15 (APOBEC-3G) has also been shown to interfere with other retroelements,
including but not limited to hepatitis B virus (HBV) and murine leukemia virus (MLV). The
methods and compositions described herein are useful with any of these viruses (Bishop et al.,
Curr. Bio. 14:1392-1396, 2004; Machida et al., PNAS 101(12):4262-67, 2004; Turelli et al.,
Science 303:1829, 2004).

Human HIV-1 virus contains a 10-kb single-stranded, positive-sense RNA genome that
encodes three major classes of gene products that include: (i) structural proteins such as Gag, Pol
and Env; (ii) essential trans-acting proteins (Tat, Rev); and (iii) "auxiliary" proteins that are not
required for efficient virus replication in at least some cell culture systems (Vpr, Vif, Vpu, Nef).
Among these proteins, Vif is required for efficient virus replication in vivo, as well as in certain
host cell types in vitro (Fisher et al. Science 237(4817):888-93, 1987; Strebel et al. Nature
328(6132):728-30, 1987) because of its ability to overcome the action of a cellular antiviral
system (Madani et al. J Virol 72(12):10251-5, 1998; Simon et al. Nat Med 4(12):1397-400,
1998).

The in vitro replicative phenotype of vif-deleted molecular clones of HIV-1 is strikingly
different in vif-permissive cells (e.g., 293T, SUPT1 and CEM-SS T cell lines), as compared to
vif-non-permissive cells (e.g., primary T cells, macrophages, or CEM, H9 and HUT78 T cell
lines). In the former cells, vif-deleted HIV-1 clones replicate with an efficiency that is
essentially identical to that of wild-type virus, whereas in the latter cells, replication of
vif-negative HIV-1 mutants is arrested due to a failure to accumulate reverse transcripts and an
inability to generate infectious proviral integrants in the host cell (Sova et al. J Virol
67(10):6322-6, 1993; von Schwedler et al. J Virol 67(8):4945-55, 1993; Simon et al. J Virol
70(8):5297-305, 1996; Courcoul et al. J Virol 69(4):2068-74, 1995). These defects are due to
the expression of the host protein CEM15 (Sheehy, A.M., et al. (2002) Nature 418:646-650) in
cells that are non-permissive for vif-minus viruses. CEM15 antiviral activity is derived from
effects on viral RNA or reverse transcripts (Sheehy, A.M., et al. (2002) Nature 418:646-650). CEM15
deaminates dC to dU as the first strand of DNA is being made by reverse transcriptase or soon
after its completion, and this results in dG to dA changes at the corresponding positions during
second strand DNA synthesis (Harris et al. Cell 113:803-809, 2003).
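
The strand logic described above can be illustrated with a minimal sketch. The function names and the six-nucleotide test sequence are hypothetical and are not part of the disclosure; the sketch only assumes that a dU in the first (minus) strand is copied as dA when the second (plus) strand is synthesized.

```python
from typing import Iterable

# dU in the minus strand templates dA in the plus strand.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA strand (dU pairs like dT)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def deaminate(minus_strand: str, positions: Iterable[int]) -> str:
    """Convert dC to dU at the given 0-based positions of the minus strand."""
    targets = set(positions)
    edited = []
    for index, base in enumerate(minus_strand):
        if index in targets:
            if base != "C":
                raise ValueError(f"position {index} is {base}, not C")
            edited.append("U")
        else:
            edited.append(base)
    return "".join(edited)


def plus_strand_after_editing(plus_strand: str, minus_positions: Iterable[int]) -> str:
    """Plus-strand sequence after minus-strand dC-to-dU editing."""
    minus = reverse_complement(plus_strand)      # first-strand synthesis
    edited_minus = deaminate(minus, minus_positions)
    return reverse_complement(edited_minus)      # second-strand synthesis
```

For the hypothetical plus strand `ATGGCA`, the minus strand is `TGCCAT`; editing the dC at minus-strand position 2 gives a plus strand of `ATGACA`, that is, a dG to dA change at the corresponding position.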

Primary sequence alignments (Figure 3) and the structural constraints relating CDAs to
APOBEC-1 indicate that CEM15 evolved from an APOBEC-1-like precursor by gene
duplication (Wedekind et al. Trends Genet 19(4): p. 207-16, 2003). The resulting CEM15
structure exhibits two active sites per polypeptide chain with the topology
CD1-PCD1-connector-CD2-PCD2. Knowledge of the structural homology among CDAs and ARPs is
sufficient to understand how features of CEM15 contribute to its anti-viral activity.

Vif interacts with CEM15 and induces its poly-ubiquitination and degradation through
the proteasome, thereby reducing the abundance of CEM15 and promoting viral infectivity. It
has been discovered that Vif homodimers are required for Vif's interaction with CEM15 (Yang
et al. J Biol Chem. 278(8):6596-602 (2003) and US Patent 6,653,443, herein incorporated by
reference in their entirety).

Stably expressed CEM15 significantly reduced the level of pseudotyped HIV-1 particles
lacking Vif. The reduced viral particle production is the result of a selective suppression
of viral RNA leading to a reduction in essential HIV-1 proteins. These effects were not
observed when Vif was expressed, due to the marked reduction of CEM15. Although CEM15
was required to deplete viral particle production, its deaminase function was not necessary. The
data indicate an antiviral mechanism in producer cells which is potentially significant late during
the viral life cycle, which involves directly or indirectly the RNA binding ability of CEM15, and
which does not require virion incorporation of CEM15 deaminase activity during viral replication.
Thus, agents that enhance CEM15 selective binding to viral RNA, leading to viral RNA
destruction, result in a reduction in viral particle production and a reduced viral burden for the
subject.

Disclosed herein are methods of predicting the severity of a viral infection in a subject.
In one embodiment, the method of predicting the severity of a viral infection in a subject
comprises the steps of acquiring a biological sample from the subject; and measuring the level of
expression of one or more APOBEC-1 related proteins in the subject, wherein a higher level of
expression as compared to a reference level indicates decreased severity. The reference level is a
selected control level. Alternatively, the reference level can be from one or more severely infected
subjects, and a higher level of expression in the test subject would indicate a less severe viral
infection.
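
The comparison step of this method can be sketched as follows. This is only an illustration under stated assumptions: the function name, the arbitrary expression units, and the `min_fold` cut-off are hypothetical, since the disclosure does not fix a numerical threshold.

```python
def indicates_decreased_severity(test_level: float,
                                 reference_level: float,
                                 min_fold: float = 1.0) -> bool:
    """Return True when the test subject's ARP expression exceeds the reference.

    test_level and reference_level are expression measurements in the same
    (arbitrary) units. min_fold is a hypothetical cut-off: the test level must
    be more than min_fold times the reference level to count as "higher".
    """
    if reference_level <= 0:
        raise ValueError("reference_level must be positive")
    return (test_level / reference_level) > min_fold
```

A call such as `indicates_decreased_severity(3.0, 1.0)` returns `True`, while a test level at or below the reference returns `False`. In practice the cut-off would be chosen from the selected control population rather than defaulting to 1.0.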

As disclosed above, greater amounts of APOBEC-1 related proteins can indicate a less
severe viral infection. CEM15 contributes to the control of viral replication (Sheehy et al. (2002)
Nature 418:646-650; Mariani et al. (2003) Cell 114:21-31). A number of studies have examined
the role of CEM15 in suppressing HIV replication in vitro (Harris et al. (2003) Cell 113:803-809;
Zhang et al. (2003) Nature 424:94-98; Mangeat et al. (2003) Nature 424:99-103). Also,
CEM15 genetic variants can influence HIV disease progression (An et al. (2004) J. Virol.
78:11070-11076). Increased CEM15 gene expression provides a competitive advantage in that
viral Vif is not able to destroy all of the enzyme prior to each round of infection and packaging,
and consequently, over time, mutations in the HIV genome accumulate to the point of debilitating
the virus and suppressing infectivity. As a result, a slower rate of HIV disease progression is
observed in patients with elevated CEM15 expression capacity.

In the method described above, the test subject can have a viral infection when the levels 
of expression are measured. Alternatively, the subject may be free of the viral infection in 
question, and still be tested to determine the likely response of the subject to a potential viral 
infection. 

Decreased severity can result in an increased longevity in the subject as compared to a
control. For example, if greater levels of an APOBEC-1 related protein are found, the individual
can be expected to live 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, or 11 months longer, or 1, 2, 3, 4, 5, 6, 7, 8, 9,
10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, or more years longer compared to a
control. The decreased severity can also comprise a longer asymptomatic period in the subject as
compared to a control. For example, the subject can remain asymptomatic for 1, 2, 3, 4, 5, 6, 7,
8, 9, 10, or 11 months longer, or 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, or more years longer compared to a control. Further, the decreased severity
can result in reduced symptoms of the viral infection (e.g., reduced fever, reduced inflammation,
and reduced secondary infections).

The decreased severity can be manifested in a number of different ways. For example, the
decreased severity can comprise high CD4 counts as compared to a control. The CD4 count has
been used as a measurement to determine the strength of the immune system. It can also be used
to judge how far a viral infection has advanced (the stage of the disease), and it helps predict the
risk of complications and opportunistic infections. The CD4 count can be compared with a count
obtained from an earlier test in the same subject. The CD4 count can also be used in combination
with the viral load test, which measures the level of HIV in the blood, to determine the staging
and outlook of the disease. A CD4 count and a viral load test are usually ordered when a subject
is diagnosed with a virus, such as HIV, as part of a baseline measurement. Both tests are
commonly repeated about four weeks after starting anti-HIV therapy. If treatment is maintained,
a CD4 count can be performed every three to four months thereafter, for example.

Normal CD4 counts in adults range from 500 to 1,500 cells per cubic millimeter of
blood. In general, the CD4 count goes down as the viral disease progresses. According to public
health guidelines, preventive therapy should be started when an HIV-positive person who has no
symptoms registers a CD4 count under 350. The Centers for Disease Control and Prevention
considers HIV-infected persons who have CD4 counts below 200 to have AIDS, regardless of
whether they are symptomatic.
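
The CD4 ranges recited above can be expressed as a simple lookup. The sketch below is illustrative only; the function name and the category labels are hypothetical, and the cut-offs are the ones stated in the preceding paragraph (cells per cubic millimeter of blood).

```python
def interpret_cd4(count: float) -> str:
    """Map a CD4 count (cells per cubic millimeter) to the ranges recited above."""
    if count < 0:
        raise ValueError("CD4 count cannot be negative")
    if count < 200:
        return "aids_defining"        # below 200: considered AIDS by the CDC
    if count < 350:
        return "therapy_threshold"    # under 350: preventive therapy indicated
    if count < 500:
        return "below_normal"         # between the therapy threshold and normal
    if count <= 1500:
        return "normal"               # 500 to 1,500: normal adult range
    return "above_normal"
```

For example, `interpret_cd4(300)` returns `"therapy_threshold"` and `interpret_cd4(800)` returns `"normal"`.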

The decreased severity can also comprise lower HIV viremia levels as compared to a
control. Quantitative measurements of HIV viremia in peripheral blood have shown that higher
virus levels can be correlated with increased risk of clinical progression of HIV disease, and that
reductions in plasma virus levels can be associated with decreased risk of clinical progression.
Virus levels in the peripheral blood can be quantitated by direct measurement of viral RNA in
plasma using nucleic acid amplification technologies, such as the polymerase chain reaction
assay, branched DNA assay and nucleic acid sequence-based amplification assay. These assays
quantify human immunodeficiency virus (HIV) RNA levels. Plasma viral load (PVL) testing has
become a cornerstone of HIV disease management. Initiation of antiretroviral drug therapy is
usually recommended when the PVL is 10,000 to 30,000 copies per mL or when CD4+
T-lymphocyte counts are less than 350 to 500 per mm³ (0.35 to 0.50 × 10⁹ per L). PVL levels
usually show a 1- to 2-log reduction within four to six weeks after therapy is started. The goal is
no detectable virus in 16 to 24 weeks. Periodic monitoring of PVL is important to promptly
identify treatment failure. The same assay can be used for serial PVL testing in the subject. At
least two PVL measurements are usually performed before antiretroviral drug therapy is initiated
or changed.
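
The "1- to 2-log reduction" referred to above is a base-10 logarithmic drop in copies per mL. A minimal sketch of that arithmetic follows; the function names and the default `min_log` value are hypothetical and serve only to make the calculation explicit.

```python
import math


def log10_reduction(baseline_copies_per_ml: float,
                    follow_up_copies_per_ml: float) -> float:
    """Base-10 log reduction in plasma viral load between two measurements."""
    if baseline_copies_per_ml <= 0 or follow_up_copies_per_ml <= 0:
        raise ValueError("viral loads must be positive to take a logarithm")
    return math.log10(baseline_copies_per_ml / follow_up_copies_per_ml)


def shows_expected_early_response(baseline_copies_per_ml: float,
                                  follow_up_copies_per_ml: float,
                                  min_log: float = 1.0) -> bool:
    """True when the drop is at least min_log logs (1 log is a 10-fold drop)."""
    return log10_reduction(baseline_copies_per_ml,
                           follow_up_copies_per_ml) >= min_log
```

A fall from 100,000 to 1,000 copies per mL is a 2-log reduction, whereas a fall from 100,000 to 50,000 copies per mL is only about 0.3 log and would not meet a 1-log criterion.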

Stably expressed CEM15 significantly reduced the level of pseudotyped HIV-1 particles
lacking Vif. The reduced viral particle production is the result of a selective suppression
of viral RNA leading to reduction in essential HIV-1 proteins. These effects were not
observed when Vif was expressed due to the marked reduction of CEM15. The data indicate an
antiviral mechanism in producer cells which is potentially significant late during the viral life
cycle that involves directly or indirectly the RNA binding ability of CEM15 and does not require
virion incorporation of CEM15 deaminase activity during viral replication.

One of ordinary skill in the art at the time of the invention would know how to measure 
either DNA, mRNA or protein. For example, they can be measured using a blood sample, a 
cellular extract, or a tissue extract. Urine samples can also be used. 

Also disclosed are methods of predicting whether a subject is or will be a long-term
nonprogressor (LTNP) when infected with a virus. In one embodiment, this method comprises
acquiring a biological sample from the subject; and measuring the level of expression of one or
more APOBEC-1 related proteins in the subject, wherein a higher level of expression as
compared to a reference level (e.g., normal level) indicates the subject is a potential LTNP. If the
reference level is that of a rapid progressor, then the difference in the levels may be greater.

A small subset of HIV-infected individuals, known as long-term nonprogressors
(LTNPs), have substantially slower rates of disease progression in the absence of therapeutic
intervention. Clinically, these LTNPs are usually asymptomatic and maintain high CD4 counts and
low HIV viremia levels. These characteristics are therefore of prognostic value in evaluating
disease severity. The mechanisms responsible for long-term nonprogression have previously
been attributed to defective or less fit HIV variants, strong host immune responses and unique
host genetic elements, such as the CCR5 genotype and HLA haplotypes (Buchbinder et al.
(1999) Microbes and Infection 1:1113-1120). As disclosed in Example 1, these LTNPs are
associated with higher levels of APOBEC-1 related proteins.

As disclosed above, the indication of an LTNP can be manifested in a number of different
ways. For example, the decreased severity can comprise high CD4 counts as compared to a
control. The decreased severity can also comprise lower HIV viremia levels as compared to a
control.

In the methods described above, the subject can have a viral infection when the levels of
expression are measured. Alternatively, the subject may be free of the viral infection in question,
and still be tested to determine the likely response of the subject as a potential LTNP. The viral
infection can be a lentiviral infection, such as HIV-1.

The RNA virus can be selected from the list of viruses consisting of Vesicular stomatitis
virus, Hepatitis A virus, Hepatitis C virus, Rhinovirus, Coronavirus, Influenza virus A, Influenza
virus B, Measles virus, Respiratory syncytial virus, Adenovirus, Coxsackie virus, Dengue virus,
Mumps virus, Poliovirus, Rabies virus, Rous sarcoma virus, Yellow fever virus, Ebola virus,
Marburg virus, Lassa fever virus, Eastern Equine Encephalitis virus, Japanese Encephalitis virus,
St. Louis Encephalitis virus, Murray Valley fever virus, West Nile virus, Rift Valley fever virus,
Rotavirus A, Rotavirus B, Rotavirus C, Sindbis virus, Hantavirus, and Rubella virus.

Also disclosed herein are methods of optimizing antiviral therapy in a subject with a
viral infection. In one embodiment, the method comprises the steps of acquiring a biological
sample from the subject; detecting the level of expression of one or more APOBEC-1 related
proteins in the sample; and adjusting the antiviral therapy according to the levels of APOBEC-1
related proteins, thereby optimizing the antiviral therapy. An antiviral therapy that is associated
with high levels of ARP is desired.

There are many types of antiviral therapy available. These therapies include, but are not
limited to, nucleoside reverse transcriptase inhibitors, non-nucleoside reverse transcriptase
inhibitors, nucleotide reverse transcriptase inhibitors, protease inhibitors, fusion inhibitors,
integrase inhibitors, or any combination thereof.

The antiviral therapy can be reduced when the expression levels of APOBEC-1 related
proteins are high as compared to a reference level. Many of the therapies available to those with a
viral infection are expensive and have undesirable side effects. If a subject is expressing high
levels of APOBEC-1 related proteins, antiviral therapy can be reduced accordingly, thereby
making treatment options customizable to the subject in need thereof. Alternatively, the
antiviral therapy can be increased when the expression levels of APOBEC-1 related proteins are
low as compared to a reference level. If the levels are found to be below a normal range, or an
optimal amount, the treatment can be increased accordingly.

Also disclosed are methods of predicting the level of CD4 cells in a subject, comprising
acquiring a biological sample from the subject; and detecting the level of CEM15 expression in
the subject, the level of CEM15 correlating with the level of CD4 cells. As disclosed above, the
CD4 count has been used as a measurement to determine the strength of the immune system. It
can also be used to judge how far a viral infection is advanced (the stage of the disease), and
helps predict the risk of complications and opportunistic infections.

Also disclosed is a method of monitoring effectiveness of an antiviral agent in a subject.
In one embodiment, the method comprises detecting expression levels of one or more APOBEC-1
related proteins in a first biological sample from the subject prior to administration of the agent;
and detecting expression levels of one or more APOBEC-1 related proteins in a second or any
subsequent biological sample from the subject after administration of the agent, an increase in
expression levels of the APOBEC-1 related proteins in the second or subsequent sample as
compared to the first sample indicating effectiveness of the antiviral agent.
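
The before-and-after comparison in this monitoring method can be sketched as below. The function name and the plain numeric inputs are hypothetical; the sketch simply assumes that all samples were measured with the same assay and are reported in the same units.

```python
from typing import Sequence


def agent_appears_effective(first_sample_level: float,
                            later_sample_levels: Sequence[float]) -> bool:
    """True when any post-administration sample exceeds the pre-administration level.

    first_sample_level: ARP expression before the agent is administered.
    later_sample_levels: ARP expression in the second and any subsequent samples.
    """
    if not later_sample_levels:
        raise ValueError("at least one post-administration sample is required")
    return any(level > first_sample_level for level in later_sample_levels)
```

For example, a baseline of 1.0 followed by samples of 0.9 and 1.4 would indicate effectiveness, whereas samples of 0.8 and 1.0 would not.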

In one example, the agent administered to the subject targets a Vif/CEM15 interaction.
The agent can be, for example, a Vif antagonist or a cytidine deaminase activator. The agent can
also be selected from the group consisting of nucleoside reverse transcriptase inhibitors,
non-nucleoside reverse transcriptase inhibitors, nucleotide reverse transcriptase inhibitors, protease
inhibitors, and fusion inhibitors, or a combination thereof.

Also disclosed herein are methods for correlating a specific anti-viral therapy with
CEM15 levels in a subject. For example, disclosed is a method of treating a subject infected with
a virus with an appropriate antiviral agent, comprising the steps of: identifying a population of
subjects with a given range of APOBEC-1 related protein levels; determining which antiviral
agent is most effective at the given range of APOBEC-1 related protein levels; and administering
an appropriate antiviral agent to the subject in need thereof. Therefore, treatment options can be
customized to an individual based on their specific ARP level. By so doing, subjects can be
treated based on their specific needs. High levels of CEM15, for example, can dictate that the
subject is in need of one type of therapy, while low levels of CEM15 can indicate that a different
type of therapy would be more effective. One of ordinary skill in the art is able to determine
ARP ranges. These levels can then be coordinated to a given treatment therapy, as disclosed
herein.

In the methods disclosed herein, the APOBEC-1 related proteins can be selected from
the group consisting of CEM15, APOBEC-3B, APOBEC-3C, and APOBEC-3F. APOBEC-3F has
potent activity against virion infectivity factor deficient (Δvif) human immunodeficiency virus 1
(HIV-1). These enzymes become encapsidated in Δvif HIV-1 virions and in the next round of
infection deaminate the newly synthesized reverse transcripts. APOBEC-3B and APOBEC-3C
have potent antiviral activity against simian immunodeficiency virus (SIV). Both enzymes were
encapsidated in SIV virions and were active against Δvif SIV(mac) and SIV(agm). APOBEC-3B
induced abundant G to A mutations in both wild-type and Δvif SIV reverse transcripts.
APOBEC-3C induced substantially fewer mutations. APOBEC-3F was found to be active
against SIV and sensitive to SIV(mac) Vif (Yu et al. J Biol Chem. (2004) Dec
17;279(51):53379-86).

Expression of the APOBEC-1 related protein can be measured by detecting DNA,
mRNA or protein levels of the APOBEC-1 related protein. For example, mRNA levels are
detected by PCR, such as real time PCR (rtPCR).

PCR is useful for obtaining quantitative information about the expression of many
different genes in a sample that can contain as little as a single cell. Since the disclosed methods
are quantitative, comparisons of the expression patterns at a quantitative level between a variety
of different cell states or cell types can be achieved. In general, total RNA can be isolated from
the target sample using any isolation procedure. This RNA can then be used to generate first
strand copy DNA (cDNA) using any procedure, for example using random primers, oligo-dT
primers, or random-oligo-dT primers, which are oligo-dT primers coupled, on the 3' end, to short
stretches of specific sequence covering all possible combinations, so that the primer primes at the
junction between the polyA tract and the non-polyA tract associated with messenger RNA
(mRNA). The cDNA is then used as a template in a PCR reaction. This PCR reaction is
performed with primer sets, each comprising a forward and a reverse primer, that are specific for
the expressed genes that are to be tracked.

A real time PCR protocol can be used with the methods disclosed herein. These methods,
for example, rely on increases in fluorescence at each cycle of PCR through, for example, the
release of fluorescein from a quencher sequence while the uniprimer (universal primer) binds to
the DNA sequence. Fluorescence approaches used in real-time quantitative PCR are typically
based on a fluorescent reporter dye such as SYBR green, FAM, fluorescein, HEX, TET,
TAMRA, etc. and a quencher such as DABSYL or Black Hole, for example. When the quencher
is separated from the probe during the extension phase of PCR, the fluorescence of the reporter
can be measured. Systems like Molecular Beacons, TaqMan Probes, Scorpion Primers or Sunrise
Primers and others use this approach to perform real-time quantitative PCR. Examples of
methods and reagents related to real time probes can be found in United States Patent Nos.
5,925,517; 6,103,476; 6,150,097; and 6,037,130, which are incorporated by reference herein at
least for material related to detection methods for nucleic acids and PCR methods.
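
The disclosure does not prescribe a particular formula for converting real time PCR read-outs into relative expression levels. As one illustration, the widely used comparative threshold-cycle (2^-ΔΔCt) calculation is sketched below. This is a swapped-in technique, not a method recited herein: it assumes a reference (housekeeping) transcript measured in the same samples and roughly 100% amplification efficiency, and all function and parameter names are hypothetical.

```python
def relative_expression(ct_target_test: float,
                        ct_reference_test: float,
                        ct_target_control: float,
                        ct_reference_control: float) -> float:
    """Fold change of a target transcript in a test sample versus a control sample.

    Each argument is a threshold cycle (Ct). The target Ct is first normalized
    to a reference transcript within each sample (delta Ct), and the test
    sample is then compared with the control sample (delta-delta Ct).
    Assumes the amount of product doubles every cycle.
    """
    delta_ct_test = ct_target_test - ct_reference_test
    delta_ct_control = ct_target_control - ct_reference_control
    delta_delta_ct = delta_ct_test - delta_ct_control
    return 2.0 ** (-delta_delta_ct)
```

With hypothetical Ct values of 22 (target) and 18 (reference) in the test sample, and 24 and 18 in the control sample, the target transcript is four-fold more abundant in the test sample. Lower Ct means more starting template, which is why the exponent is negated.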

The cDNA sequences of APOBEC-3B (SEQ ID NO: 9), APOBEC-3C (SEQ ID NO:
11), APOBEC-3F (SEQ ID NO: 13) and CEM15 (SEQ ID NO: 5) are highly homologous but
have several stretches of non-identity that can be used in the design of specific primers and/or
probes for the selective real time PCR quantification of each homolog. APOBEC-3C is half the
size of APOBEC-3B, APOBEC-3F and CEM15 and is homologous to only the 3' portion of
these transcripts. Consequently, primer and probe combinations within the 5' half of
APOBEC-3B, APOBEC-3F and CEM15 do not amplify APOBEC-3C. Importantly, the APOBEC-3C
cDNA sequence between nucleotides 1-194 is not well conserved with comparable regions within
the 3' half of APOBEC-3B, APOBEC-3F and CEM15 and therefore can be used in the design of
primers and probes for the selective amplification and quantification of APOBEC-3C. For
example, SEQ ID NO: 17 discloses nucleotides 1-194 of APOBEC-3C and this sequence, along
with fragments or portions thereof, can be used to specifically amplify or detect APOBEC-3C.

Regions of APOBEC-3F sequence with significant divergence that have utility in the
selective amplification and quantification of this cDNA are apparent from nucleotides 1-60 and
1328-1725 (SEQ ID NOS: 18 and 19). Moreover, APOBEC-3F has a unique 1000 nucleotide
long 3' untranslated region that has utility in quantifying this cDNA. SEQ ID NOS: 18 and 19,
or fragments or portions thereof, can be used to specifically amplify or detect APOBEC-3F.

Regions of APOBEC-3B sequence divergence that have utility in the selective amplification and
quantification of this cDNA are apparent from nucleotides 1-67 and 910-1007 (SEQ ID NOS: 15
and 16). SEQ ID NOS: 15 and 16, or fragments or portions thereof, can be used to specifically
amplify or detect APOBEC-3B.
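
The divergent regions recited above can be pulled out of a full-length cDNA as candidate primer or probe target windows. The sketch below is illustrative only: the dictionary and function names are hypothetical, the coordinates are the 1-based nucleotide ranges stated in the text, and the example uses a synthetic placeholder sequence rather than any of the SEQ ID NOS.

```python
from typing import Dict, List, Tuple

# 1-based, inclusive nucleotide ranges taken from the text above.
DIVERGENT_REGIONS: Dict[str, List[Tuple[int, int]]] = {
    "APOBEC-3C": [(1, 194)],
    "APOBEC-3F": [(1, 60), (1328, 1725)],
    "APOBEC-3B": [(1, 67), (910, 1007)],
}


def extract_divergent_regions(homolog: str, cdna: str) -> List[str]:
    """Return the subsequences of cdna covering each divergent region."""
    windows = []
    for start, end in DIVERGENT_REGIONS[homolog]:
        if end > len(cdna):
            raise ValueError(
                f"{homolog} region {start}-{end} exceeds cDNA length {len(cdna)}")
        # Convert 1-based inclusive coordinates to a Python slice.
        windows.append(cdna[start - 1:end])
    return windows


# Synthetic placeholder sequence (not a real APOBEC cDNA), 1200 nt long.
placeholder_cdna = "ACGT" * 300
window_lengths = [len(w) for w in
                  extract_divergent_regions("APOBEC-3B", placeholder_cdna)]
```

For the APOBEC-3B coordinates, the two windows are 67 and 98 nucleotides long. Primers designed within such windows would still need to be checked against the other homologs for specificity.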

Expression can also be measured by detecting protein levels of APOBEC-1 related
proteins. Such detection can occur, for example, by Western blotting. CEM15 protein levels can
also be detected using ELISA. Those of skill in the art know how to quantify protein levels
using Western blotting or ELISA techniques.

Disclosed herein are methods of screening for an antiviral agent, comprising
administering to a subject with a viral infection an agent to be screened; and detecting
expression levels of one or more APOBEC-1 related proteins in a biological sample from the
subject, an increased expression level indicating an antiviral agent.

As discussed above, an "increased expression level" means an increase in the level of the
APOBEC-1 related protein as compared to a control. Therefore, the antiviral agent inhibits or
suppresses viral infectivity. An "inhibitor" or "suppressor" can be anything that reduces
activity. If the amount of CEM15 is increased in the presence of the composition as compared to
the amount of CEM15 in the absence of the composition, the composition can be said to increase
the expression level of CEM15.

The screening methods disclosed herein can be used with a high throughput screening
assay, for example. The high throughput assay system can comprise an immobilized array of test
compounds. Alternatively, the Vif molecule or the cytidine deaminase molecule can be
immobilized. There are multiple high throughput screening assay techniques that are well
known in the art (for example, but not limited to, those described in Abriola et al., J. Biomol.
Screen 4:121-127, 1999; Blevitt et al., J. Biomol. Screen 4:87-91, 2000; Hariharan et al., J.
Biomol. Screen 4:187-192, 1999; Fox et al., J. Biomol. Screen 4:183-186, 1999; Burbaum and
Sigal, Curr. Opin. Chem. Biol. 1:72-78, 1997; Jayasena, Clin. Chem. 45:1628-1650, 1999; and
Famulok and Mayer, Curr. Top. Microbiol. Immunol. 243:123-136, 1999).

Agents with antiviral activity can be identified from large libraries of natural products or
synthetic (or semi-synthetic) extracts or chemical libraries according to methods known in the
art. Those skilled in the field of drug discovery and development will understand that the
precise source of test extracts or compounds is not critical to the screening procedure(s) of the
invention. Accordingly, virtually any number of chemical extracts or compounds can be
screened using the exemplary methods described herein. Examples of such extracts or
compounds include, but are not limited to, plant-, fungal-, prokaryotic- or animal-based extracts,
fermentation broths, and synthetic compounds, as well as modification of existing compounds.
Numerous methods are also available for generating random or directed synthesis (e.g.,
semi-synthesis or total synthesis) of any number of chemical compounds, including, but not limited to,
saccharide-, lipid-, peptide-, and nucleic acid-based compounds (e.g., but not limited to,
antibodies, peptides, and aptamers). Synthetic compound libraries are commercially available,
e.g., from Brandon Associates (Merrimack, NH) and Aldrich Chemical (Milwaukee, WI).

The ability of a test compound to enhance CEM15 expression can be measured by
contacting the test compound with a cell in the presence of CEM15, either in vivo or in vitro.
The CEM15 function can be, but is not limited to, its cytidine to uridine editing of RNA, or its
deoxycytidine to deoxyuridine mutation of DNA, or its suppression of viral activity, or its
activity on cancerous or precancerous cells. An "increase in CEM15" is defined as a 10%, 20%,
30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 2-fold, 10-fold, 100-fold, or 1000-fold increase
in the amount of the CEM15. Also contemplated is an increase in the activity of CEM15.
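
The percentage and fold increases listed above are two expressions of the same ratio. A minimal sketch of the arithmetic follows; the function names and the `min_percent` default are hypothetical, the default merely reflecting the smallest increase (10%) recited in the definition.

```python
def percent_increase(amount_without: float, amount_with: float) -> float:
    """Percent increase in CEM15 amount in the presence of a test compound."""
    if amount_without <= 0:
        raise ValueError("the amount without the compound must be positive")
    return (amount_with - amount_without) / amount_without * 100.0


def fold_change(amount_without: float, amount_with: float) -> float:
    """Ratio of the amount with the compound to the amount without it."""
    if amount_without <= 0:
        raise ValueError("the amount without the compound must be positive")
    return amount_with / amount_without


def is_increase_in_cem15(amount_without: float, amount_with: float,
                         min_percent: float = 10.0) -> bool:
    """True when the increase meets the chosen minimum percentage."""
    return percent_increase(amount_without, amount_with) >= min_percent
```

For example, going from 2.0 to 3.0 units is a 50% increase (1.5-fold), and going from 2.0 to 20.0 units is a 10-fold change (a 900% increase). A 100% increase corresponds to a 2-fold change.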

Disclosed herein are primers, probes, and nucleic acid sequences corresponding to 
proteins thereof, such as Vif and the ARP family of proteins. For example SEQ ID NOS: 15-16 
can be used to amplify APOBEC-3B. 

It is understood that, as discussed herein, the terms homology and identity mean
the same thing as similarity. Thus, for example, if the word homology is used
between two non-natural sequences, it is understood that this is not necessarily indicating an
evolutionary relationship between these two sequences, but rather is looking at the similarity or
relatedness between their nucleic acid sequences. Many of the methods for determining
homology between two evolutionarily related molecules are routinely applied to any two or
more nucleic acids or proteins for the purpose of measuring sequence similarity regardless of
whether they are evolutionarily related or not.

In general, it is understood that one way to define any known variants and derivatives, or
those that might arise, of the disclosed genes and proteins herein, is through defining the variants
and derivatives in terms of homology to specific known sequences. This identity of particular
sequences disclosed herein is also discussed elsewhere herein. In general, variants of genes and
proteins herein disclosed typically have at least about 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80,
81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, or 99 percent homology to
the stated sequence or the native sequence. Those of skill in the art readily understand how to
determine the homology of two proteins or nucleic acids, such as genes. For example, the
homology can be calculated after aligning the two sequences so that the homology is at its
highest level.

Another way of calculating homology can be performed by published algorithms. 
Optimal alignment of sequences for comparison may be conducted by the local homology 
algorithm of Smith and Waterman, Adv. Appl. Math. 2: 482 (1981), by the homology alignment 
algorithm of Needleman and Wunsch, J. Mol. Biol. 48: 443 (1970), by the search for similarity 
method of Pearson and Lipman, Proc. Natl. Acad. Sci. U.S.A. 85: 2444 (1988), by computerized 
implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin 
Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, WI), or by 
inspection. 
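
By way of a non-limiting illustration, a percent identity calculation after a global alignment 
in the manner of Needleman and Wunsch could be sketched as follows. The scoring values 
(match +1, mismatch -1, gap -1), the function names, and the choice to divide by the alignment 
length are assumptions made only for this sketch; they are not parameters taken from GAP, 
BESTFIT, or any other package named above. 

```python
def global_align(a, b, match=1, mismatch=-1, gap=-1):
    """Needleman-Wunsch style global alignment (illustrative scoring values).

    Returns the two input strings with '-' inserted at gap positions.
    """
    n, m = len(a), len(b)
    # score[i][j] holds the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if (i > 0 and j > 0 and a[i - 1] == b[j - 1]) else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def percent_identity(a, b):
    """Percent of aligned columns holding identical residues (assumed definition)."""
    aln_a, aln_b = global_align(a, b)
    if not aln_a:
        return 0.0
    matches = sum(x == y and x != "-" for x, y in zip(aln_a, aln_b))
    return 100.0 * matches / len(aln_a)
```

Other denominators (for example, the length of the shorter sequence) and other scoring schemes 
are equally possible, which is one reason the different methods can report different percentages. 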

The same types of homology can be obtained for nucleic acids by, for example, the 
algorithms disclosed in Zuker, M. Science 244:48-52, 1989, Jaeger et al. Proc. Natl. Acad. Sci. 
USA 86:7706-7710, 1989, Jaeger et al. Methods Enzymol 183:281-306, 1989, which are herein 
incorporated by reference for at least material related to nucleic acid alignment. It is understood 
that any of the methods typically can be used and that in certain instances the results of these 
various methods may differ, but the skilled artisan understands that if identity is found with at 
least one of these methods, the sequences would be said to have the stated identity, and be 
disclosed herein. 

For example, as used herein, a sequence recited as having a particular percent homology 
to another sequence refers to sequences that have the recited homology as calculated by any one 
or more of the calculation methods described above. For example, a first sequence has 80 
percent homology, as defined herein, to a second sequence if the first sequence is calculated to 
have 80 percent homology to the second sequence using the Zuker calculation method even if 
the first sequence does not have 80 percent homology to the second sequence as calculated by 
any of the other calculation methods. As another example, a first sequence has 80 percent 
homology, as defined herein, to a second sequence if the first sequence is calculated to have 80 
percent homology to the second sequence using both the Zuker calculation method and the 
Pearson and Lipman calculation method even if the first sequence does not have 80 percent 
homology to the second sequence as calculated by the Smith and Waterman calculation method, 
the Needleman and Wunsch calculation method, the Jaeger calculation methods, or any of the 
other calculation methods. As yet another example, a first sequence has 80 percent homology, 
as defined herein, to a second sequence if the first sequence is calculated to have 80 percent 
homology to the second sequence using each of the calculation methods (although, in practice, 
the different calculation methods will often result in different calculated homology percentages). 
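
The "any one or more" rule described above can be expressed as a short check. This is a 
minimal sketch: the function name and the example percentages are hypothetical, and treating 
the recited value as a minimum ("at least") is an assumption of the sketch. 

```python
def meets_recited_homology(calculated, recited):
    """Return True if any calculation method reaches the recited percentage.

    `calculated` maps a method name to the percent homology it reported.
    Treating `recited` as a minimum is an assumption of this sketch.
    """
    return any(value >= recited for value in calculated.values())


# Hypothetical results for one pair of sequences (illustrative numbers only).
example = {
    "Zuker": 80.0,
    "Pearson and Lipman": 76.5,
    "Smith and Waterman": 78.0,
    "Needleman and Wunsch": 74.0,
}
print(meets_recited_homology(example, 80.0))  # one method suffices
```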

The term hybridization typically means a sequence driven interaction between at least 
two nucleic acid molecules, such as a primer or a probe and a gene. Sequence driven interaction 
means an interaction that occurs between two nucleotides or nucleotide analogs or nucleotide 
derivatives in a nucleotide specific manner. For example, G interacting with C or A interacting 
with T are sequence driven interactions. Typically sequence driven interactions occur on the 
Watson-Crick face or Hoogsteen face of the nucleotide. The hybridization of two nucleic acids 
is affected by a number of conditions and parameters known to those of skill in the art. For 
example, the salt concentrations, pH, and temperature of the reaction all affect whether two 
nucleic acid molecules will hybridize. 
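
As a non-limiting sketch of the G-C and A-T pairing just described, the strand expected to 
hybridize to a given DNA sequence through Watson-Crick pairing can be derived as its reverse 
complement. The function names are hypothetical, and the sketch handles only the four 
unmodified DNA bases. 

```python
# Watson-Crick partners for the four unmodified DNA bases.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the antiparallel strand that pairs base-for-base with `seq`."""
    return "".join(PAIRS[base] for base in reversed(seq.upper()))


def fully_complementary(strand_x, strand_y):
    """True when every base of one strand pairs with the other strand."""
    return reverse_complement(strand_x) == strand_y.upper()


print(reverse_complement("GATTC"))
```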

Parameters for selective hybridization between two nucleic acid molecules are well 
known to those of skill in the art. For example, in some embodiments selective hybridization 
conditions can be defined as stringent hybridization conditions. For example, stringency of 
hybridization is controlled by both temperature and salt concentration of either or both of the 
hybridization and washing steps. For example, the conditions of hybridization to achieve 
selective hybridization may involve hybridization in high ionic strength solution (6X SSC or 6X 
SSPE) at a temperature that is about 12-25°C below the Tm (the melting temperature at which 
half of the molecules dissociate from their hybridization partners) followed by washing at a 
combination of temperature and salt concentration chosen so that the washing temperature is 
about 5°C to 20°C below the Tm. The temperature and salt conditions are readily determined 
empirically in preliminary experiments in which samples of reference DNA immobilized on 
filters are hybridized to a labeled nucleic acid of interest and then washed under conditions of 
different stringencies. Hybridization temperatures are typically higher for DNA-RNA and 
RNA-RNA hybridizations. The conditions can be used as described above to achieve stringency, 
or as is known in the art (Sambrook et al., Molecular Cloning: A Laboratory Manual, 2nd Ed., 
Cold Spring Harbor Laboratory, Cold Spring Harbor, New York, 1989; Kunkel et al., Methods 
Enzymol. 154:367, 1987, which are herein incorporated by reference for material at least 
related to hybridization of nucleic acids). A preferable stringent hybridization condition for a 
DNA:DNA hybridization can be at about 68°C (in aqueous solution) in 6X SSC or 6X SSPE 
followed by washing at 68°C. Stringency of hybridization and washing, if desired, can be 
reduced accordingly as the degree of complementarity desired is decreased, and further, 
depending upon the G-C or A-T richness of any area wherein variability is searched for. 
Likewise, stringency of hybridization and washing, if desired, can be increased accordingly as 
homology desired is increased, and further, depending upon the G-C or A-T richness of any area 
wherein high homology is desired, all as known in the art. 
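
The temperature ranges recited above follow directly from the Tm. As a minimal sketch, 
assuming the Tm has already been determined empirically or otherwise estimated, the 
hybridization and washing windows could be computed as follows; the function name is 
hypothetical and the input value in the usage line is illustrative only. 

```python
def stringency_windows(tm_celsius):
    """Temperature windows (low, high) in degrees C derived from a known Tm.

    Hybridization: about 12 to 25 degrees below Tm.
    Washing:       about 5 to 20 degrees below Tm.
    """
    return {
        "hybridization": (tm_celsius - 25, tm_celsius - 12),
        "wash": (tm_celsius - 20, tm_celsius - 5),
    }


# Illustrative input only; the Tm of a real duplex must be measured or estimated.
print(stringency_windows(80.0))
```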

Another way to define selective hybridization is by looking at the amount (percentage) 
of one of the nucleic acids bound to the other nucleic acid. For example, in some embodiments 
selective hybridization conditions would be when at least about 60, 65, 70, 71, 72, 73, 74, 75, 
76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, or 100 
percent of the limiting nucleic acid is bound to the non-limiting nucleic acid. Typically, the 
non-limiting primer is in, for example, 10 or 100 or 1000 fold excess. This type of assay can be 
performed under conditions where both the limiting and non-limiting primer are, for example, 
10 fold or 100 fold or 1000 fold below their kd, or where only one of the nucleic acid molecules 
is 10 fold or 100 fold or 1000 fold or where one or both nucleic acid molecules are above their 
kd. 

Another way to define selective hybridization is by looking at the percentage of primer 
that gets enzymatically manipulated under conditions where hybridization is required to promote 
the desired enzymatic manipulation. For example, in some embodiments selective hybridization 
conditions would be when at least about 60, 65, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 
83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, or 100 percent of the primer is 
enzymatically manipulated under conditions which promote the enzymatic manipulation. For 
example, if the enzymatic manipulation is DNA extension, then selective hybridization 
conditions would be when at least about 60, 65, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 
83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, or 100 percent of the primer 
molecules are extended. Preferred conditions also include those suggested by the manufacturer 
or indicated in the art as being appropriate for the enzyme performing the manipulation. 

Just as with homology, it is understood that there are a variety of methods herein 
disclosed for determining the level of hybridization between two nucleic acid molecules. It is 
understood that these methods and conditions may provide different percentages of hybridization 
between two nucleic acid molecules, but unless otherwise indicated meeting the parameters of 
any of the methods would be sufficient. For example, if 80% hybridization is required, then as 
long as hybridization occurs within the required parameters in any one of these methods it is 
considered disclosed herein. 

It is understood that those of skill in the art understand that if a composition or method 
meets any one of these criteria for determining hybridization either collectively or singly it is a 
composition or method that is disclosed herein. 

There are a variety of molecules disclosed herein that are nucleic acid based, including, 
for example, the nucleic acids that encode primers and probes. The disclosed nucleic acids are 
made up of, for example, nucleotides, nucleotide analogs, or nucleotide substitutes. Non-limiting 
examples of these and other molecules are discussed herein. It is understood that, for example, 
when a vector is expressed in a cell, the expressed mRNA will typically be made up of A, C, 
G, and U. Likewise, it is understood that if, for example, an antisense molecule is introduced 
into a cell or cell environment through, for example, exogenous delivery, it is advantageous that 
the antisense molecule be made up of nucleotide analogs that reduce the degradation of the 
antisense molecule in the cellular environment. 

A nucleotide is a molecule that contains a base moiety, a sugar moiety and a phosphate 
moiety. Nucleotides can be linked together through their phosphate moieties and sugar moieties 
creating an internucleoside linkage. The base moiety of a nucleotide can be adenin-9-yl (A), 
cytosin-1-yl (C), guanin-9-yl (G), uracil-1-yl (U), and thymin-1-yl (T). The sugar moiety of a 
nucleotide is a ribose or a deoxyribose. The phosphate moiety of a nucleotide is pentavalent 
phosphate. A non-limiting example of a nucleotide would be 3'-AMP (3'-adenosine 
monophosphate) or 5'-GMP (5'-guanosine monophosphate). 

A nucleotide analog is a nucleotide which contains some type of modification to either 
the base, sugar, or phosphate moieties. Modifications to nucleotides are well known in the art 
and would include, for example, 5-methylcytosine (5-me-C), 5-hydroxymethyl cytosine, 
xanthine, hypoxanthine, and 2-aminoadenine as well as modifications at the sugar or phosphate 
moieties. 

Nucleotide substitutes are molecules having similar functional properties to nucleotides, 
but which do not contain a phosphate moiety, such as peptide nucleic acid (PNA). Nucleotide 
substitutes are molecules that will recognize nucleic acids in a Watson-Crick or Hoogsteen 
manner, but which are linked together through a moiety other than a phosphate moiety. 
Nucleotide substitutes are able to conform to a double helix type structure when interacting with 
the appropriate target nucleic acid. 

It is also possible to link other types of molecules (conjugates) to nucleotides or 
nucleotide analogs to enhance, for example, cellular uptake. Conjugates can be chemically 
linked to the nucleotide or nucleotide analogs. Such conjugates include but are not limited to 
lipid moieties such as a cholesterol moiety (Letsinger et al., Proc. Natl. Acad. Sci. USA, 
1989, 86, 6553-6556). 

A Watson-Crick interaction is at least one interaction with the Watson-Crick face of a 
nucleotide, nucleotide analog, or nucleotide substitute. The Watson-Crick face of a nucleotide, 
nucleotide analog, or nucleotide substitute includes the C2, N1, and C6 positions of a purine 
based nucleotide, nucleotide analog, or nucleotide substitute and the C2, N3, C4 positions of a 
pyrimidine based nucleotide, nucleotide analog, or nucleotide substitute. 

A Hoogsteen interaction is the interaction that takes place on the Hoogsteen face of a 
nucleotide or nucleotide analog, which is exposed in the major groove of duplex DNA. The 
Hoogsteen face includes the N7 position and reactive groups (NH2 or O) at the C6 position of 
purine nucleotides. 

Sequences 

There are a variety of sequences related to, for example, CEM15 as well as any other 
protein disclosed herein that are disclosed on Genbank, and these sequences and others are 
herein incorporated by reference in their entireties as well as for individual subsequences 
contained therein. 

A variety of sequences are provided herein and these and others can be found in 
Genbank, at www.pubmed.gov. Those of skill in the art understand how to resolve sequence 
discrepancies and differences and to adjust the compositions and methods relating to a particular 
sequence to other related sequences. Primers and/or probes can be designed for any sequence 
given the information disclosed herein and known in the art. 

Disclosed are compositions including primers and probes, which are capable of 
interacting with the genes disclosed herein. In certain embodiments the primers are used to 
support DNA amplification reactions. Typically the primers will be capable of being extended 
in a sequence specific manner. Extension of a primer in a sequence specific manner includes 
any methods wherein the sequence and/or composition of the nucleic acid molecule to which the 
primer is hybridized or otherwise associated directs or influences the composition or sequence of 
the product produced by the extension of the primer. Extension of the primer in a sequence 
specific manner therefore includes, but is not limited to, PCR, DNA sequencing, DNA 
extension, DNA polymerization, RNA transcription, or reverse transcription. Techniques and 
conditions that amplify the primer in a sequence specific manner are preferred. In certain 
embodiments the primers are used for the DNA amplification reactions, such as PCR or direct 
sequencing. It is understood that in certain embodiments the primers can also be extended using 
non-enzymatic techniques, where, for example, the nucleotides or oligonucleotides used to 
extend the primer are modified such that they will chemically react to extend the primer in a 
sequence specific manner. Typically the disclosed primers hybridize with the nucleic acid or 
region of the nucleic acid or they hybridize with the complement of the nucleic acid or 
complement of a region of the nucleic acid. 

Disclosed is an isolated nucleic acid sequence comprising a sequence at least 80% 
identical to SEQ ID NO: 1 (5'CGCAGCCTGTGTCAGAAAAG3'). Also disclosed is an isolated 
nucleic acid sequence comprising the nucleotide sequence of SEQ ID NO: 1, or variants or 
fragments thereof, wherein the variant or fragment comprises a specific CEM15 primer. Also 
contemplated is an isolated nucleic acid sequence comprising at least five consecutive 
nucleotides of SEQ ID NO: 1, wherein the nucleic acid sequence comprises a specific CEM15 
primer. Also contemplated is an isolated nucleic acid sequence comprising the sequence of 
SEQ ID NO: 1. 

Disclosed herein is an isolated nucleic acid sequence comprising a sequence at least 80% 
identical to SEQ ID NO: 2 (5'CCAACAGTGCTGAAATTCGTCATA3'). Contemplated herein 
is an isolated nucleic acid sequence comprising the nucleotide sequence of SEQ ID NO: 2, or 
variants or fragments thereof, wherein the variant or fragment comprises a specific CEM15 
primer. Described herein is an isolated nucleic acid sequence comprising at least five 
consecutive nucleotides of SEQ ID NO: 2, wherein the nucleic acid sequence comprises a 
specific CEM15 primer. Further disclosed is an isolated nucleic acid sequence comprising SEQ 
ID NO: 2. 

Disclosed herein is an isolated nucleic acid sequence comprising a sequence at least 80% 
identical to SEQ ID NO: 3 (5'GTGCCACCATGAAGA3'). Described herein is an isolated 
nucleic acid sequence comprising the nucleotide sequence of SEQ ID NO: 3, or variants or 
fragments thereof, wherein the variant or fragment comprises a specific CEM15 probe. Also 
described is an isolated nucleic acid sequence comprising at least five consecutive nucleotides of 
SEQ ID NO: 3, wherein the nucleic acid sequence comprises a specific CEM15 probe. 
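
As a non-limiting sketch, the two structural criteria recited above for SEQ ID NOS: 1-3 (a 
run of at least five consecutive nucleotides, and at least 80% identity) could be checked as 
follows. The function names are hypothetical. The identity check here is a simple ungapped, 
position-by-position comparison of equal-length sequences, which is an assumption of the 
sketch and only one of the calculation methods contemplated herein; the sketch says nothing 
about whether a sequence functions as a specific CEM15 primer or probe. 

```python
# Sequences as recited in the text, written 5' to 3'.
SEQ_ID = {
    1: "CGCAGCCTGTGTCAGAAAAG",
    2: "CCAACAGTGCTGAAATTCGTCATA",
    3: "GTGCCACCATGAAGA",
}


def has_consecutive_run(candidate, reference, k=5):
    """True if `candidate` contains at least k consecutive nucleotides of `reference`."""
    kmers = {reference[i:i + k] for i in range(len(reference) - k + 1)}
    return any(candidate[i:i + k] in kmers
               for i in range(len(candidate) - k + 1))


def ungapped_identity(candidate, reference):
    """Percent of positions that match; both sequences must have equal length."""
    if len(candidate) != len(reference):
        raise ValueError("ungapped comparison needs equal-length sequences")
    same = sum(a == b for a, b in zip(candidate, reference))
    return 100.0 * same / len(reference)


# Hypothetical variant of SEQ ID NO: 1 with two substituted positions.
variant = "TGCAGCCTGTGTCAGAAAAC"
print(ungapped_identity(variant, SEQ_ID[1]) >= 80.0)
print(has_consecutive_run("TTTGTGTCATTT", SEQ_ID[1]))
```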

Disclosed are antiviral agents identified by the screening methods disclosed herein. The 
antiviral agent can increase the expression level of CEM15 in a subject. Alternatively, the 
antiviral agent can bind, or otherwise interact, with a cytidine deaminase or deoxycytidine 
deaminase, thereby enhancing the normal activity of the cytidine deaminase or deoxycytidine 
deaminase. For example, a cytidine deaminase activator can interact with CEM15 and enhance 
the binding of CEM15 to a virus. Conversely, a cytidine deaminase activator can interact with 
the binding of Vif to a CEM15 molecule, thereby suppressing the activity of Vif, and indirectly 
enhancing CEM15 binding to HIV. 

In the methods disclosed herein, molecules such as CEM15 and Vif can be used in 
assays. These molecules can be, for example, chimeric proteins. By "chimeric protein" is meant 
any single polypeptide unit that comprises two distinct polypeptide domains joined by a peptide 
bond, optionally by means of an amino acid linker, or a non-peptide bond, wherein the two 
domains are not naturally occurring within the same polypeptide unit. Typically, such chimeric 
proteins are made by expression of a cDNA construct but could be made by protein synthesis 
methods known in the art. These chimeric proteins are useful in screening compounds, as well 
as with the compounds identified by the methods disclosed herein. 

The compositions disclosed herein can also be fragments or derivatives of a naturally 
occurring deaminase or viral infectivity factor. A "fragment" is a polypeptide that is less than the 
full length of a particular protein or functional domain. By "derivative" or "variant" is meant a 
polypeptide having a particular sequence that differs at one or more positions from a reference 
sequence. The fragments or derivatives of a full length protein preferably retain at least one 
function of the full length protein. For example, a fragment or derivative of a deaminase 
includes a fragment of a deaminase or a derivative deaminase that retains at least one binding or 
deaminating function of the full length protein. By way of example, the fragment or derivative 
can include a Zinc-Dependent Cytidine Deaminase domain or can include 20, 30, 40, 50, 60, 70, 
80, or 90% similarity with the full length deaminase. The fragment or derivative can include 
conservative or non-conservative amino acid substitutions. The fragment or derivative can 
include a linker sequence joining a catalytic domain (CD) to a pseudo-catalytic domain (PCD) 
and can have the domain structure CD-PCD-CD-PCD or any repeats thereof. The fragment or 
derivative can comprise a CD. Other fragments or derivatives are identified by structure-based 
sequence alignment (SBSA) as shown herein. See Figure 2b that reveals the consensus 
structural domain attributes of APOBEC-1 and ARPs (Figure 2c). The fragment or derivative 
optionally can form a homodimer or a homotetramer. Also disclosed are chimeric proteins, 
wherein the deaminase domain is a fragment or derivative of CEM15 having deaminase 
function. 

"Deaminases" include deoxycytidine deaminase, cytidine deaminase, adenosine 
deaminase, RNA deaminase, DNA deaminase, and other deaminases. Optionally, the deaminase 
is APOBEC-1 (see international patent application designated PCT/US02/05824, which is 
incorporated herein by reference in its entirety for APOBEC-1, chimeric proteins related thereto, 
and uses thereof) (GenBank Accession # NP_001635), REE (see U.S. Pat. No. 5,747,319, 
which is incorporated herein by reference in its entirety for REE and uses thereof), or REE-2 
(see U.S. Pat. No. 5,804,185, which is incorporated herein by reference in its entirety for REE-2 
and uses thereof). Deaminases as described herein can include the following structural features: 
three or more CDD-1 repeats, two or more functional CDD-1 repeats, one or more zinc binding 
domains (ZBDs), binding site(s) for mooring sequences, or binding sites for auxiliary RNA 
binding proteins. Deaminases optionally edit viral RNA, host cell mRNA, viral DNA, host cell 
DNA or any combination thereof. One deaminase described herein is CEM15. CEM15 is 
homologous to Phorbolin or APOBEC-3G (see, for example, Accession # NP_068594). The 
names CEM15 and APOBEC-3G can be used interchangeably. CEM15 reduces retroviral 
infectivity as an RNA or DNA editing enzyme. 

By "deaminating function" is meant a deamination of a nucleotide (e.g., cytidine, 
deoxycytidine, adenosine, or deoxyadenosine). Deaminating function is detected by measuring 
the amount of deaminated nucleotide, according to the methods taught herein, wherein such 
levels are above background levels (preferably at least 1.5-2.5 times the background levels of 
the assay). 
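
As a minimal sketch of the background comparison just described, a measured amount of 
deaminated nucleotide can be compared with the assay background as a fold ratio. The function 
name and the default threshold of 1.5 (the lower end of the preferred range) are choices made 
for this sketch, and the example readings are illustrative only. 

```python
def shows_deaminating_function(signal, background, fold=1.5):
    """True if the measured signal is at least `fold` times the assay background.

    `signal` and `background` are amounts of deaminated nucleotide in the
    same units; `fold` defaults to the lower end of the preferred range.
    """
    if background <= 0:
        raise ValueError("background must be a positive measurement")
    return signal / background >= fold


# Illustrative readings only (arbitrary units).
print(shows_deaminating_function(30.0, 10.0))            # 3.0-fold over background
print(shows_deaminating_function(20.0, 10.0, fold=2.5))  # 2.0-fold, below 2.5
```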

Protein variants and derivatives are well understood to those of skill in the art and can 
involve amino acid sequence modifications. For example, amino acid sequence modifications 
typically fall into one or more of three classes: substitutional, insertional or deletional variants. 
Insertions include amino and/or carboxyl terminal fusions as well as intrasequence insertions of 
single or multiple amino acid residues. Insertions ordinarily will be smaller insertions than those 
of amino or carboxyl terminal fusions, for example, on the order of one to four residues. 
Immunogenic fusion protein derivatives, such as those described in the examples, are made by 
fusing a polypeptide sufficiently large to confer immunogenicity to the target sequence by 
cross-linking in vitro or by recombinant cell culture transformed with DNA encoding the fusion. 
Deletions are characterized by the removal of one or more amino acid residues from the protein 
sequence. Typically, no more than about from 2 to 6 residues are deleted at any one site within 
the protein molecule. These variants ordinarily are prepared by site specific mutagenesis of 
nucleotides in the DNA encoding the protein, thereby producing DNA encoding the variant, and 
thereafter expressing the DNA in recombinant cell culture. Techniques for making substitution 
mutations at predetermined sites in DNA having a known sequence are well known, for example 
M13 primer mutagenesis and PCR mutagenesis. Amino acid substitutions are typically of single 
residues, but can occur at a number of different locations at once; insertions usually will be on 
the order of about from 1 to 10 amino acid residues; and deletions will range about from 1 to 30 
residues. Deletions or insertions preferably are made in adjacent pairs, i.e. a deletion of 2 
residues or insertion of 2 residues. Substitutions, deletions, insertions or any combination 
thereof may be combined to arrive at a final construct. The mutations must not place the 
sequence out of reading frame and preferably will not create complementary regions that could 
produce secondary mRNA structure. Substitutional variants are those in which at least one 
residue has been removed and a different residue inserted in its place. Such substitutions 
generally are made in accordance with the following Table 2 and are referred to as conservative 
substitutions. 

Substantial changes in function or immunological identity are made by selecting 
substitutions that are less conservative, i.e., selecting residues that differ more significantly in 
their effect on maintaining (a) the structure of the polypeptide backbone in the area of the 
substitution, for example as a sheet or helical conformation, (b) the charge or hydrophobicity of 
the molecule at the target site or (c) the bulk of the side chain. The substitutions which in 
general are expected to produce the greatest changes in the protein properties will be those in 
which (a) a hydrophilic residue, e.g. seryl or threonyl, is substituted for (or by) a hydrophobic 
residue, e.g. leucyl, isoleucyl, phenylalanyl, valyl or alanyl; (b) a cysteine or proline is 
substituted for (or by) any other residue; (c) a residue having an electropositive side chain, e.g., 
lysyl, arginyl, or histidyl, is substituted for (or by) an electronegative residue, e.g., glutamyl or 
aspartyl; (d) a residue having a bulky side chain, e.g., phenylalanine, is substituted for (or by) 
one not having a side chain, e.g., glycine; or (e) the number of sites for sulfation and/or 
glycosylation is increased. 



TABLE 2: Amino Acid Substitutions 

Original Residue    Exemplary Substitutions 

Ala                 Ser 
Arg                 Lys 
Asn                 Gln 
Asp                 Glu 
Cys                 Ser 
Gln                 Asn 
Glu                 Asp 
Gly                 Pro 
His                 Gln 
Ile                 Leu; Val 
Leu                 Ile; Val 
Lys                 Arg; Gln 
Met                 Leu; Ile 
Phe                 Met; Leu; Tyr 
Pro                 Gly 
Ser                 Thr 
Thr                 Ser 
Trp                 Tyr 
Tyr                 Trp; Phe 
Val                 Ile; Leu 
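
For illustration only, Table 2 can be encoded as a lookup so that a proposed substitution is 
checked against the exemplary substitutions listed for the original residue. The function name 
is hypothetical, and the three-letter codes Gln and Ile are used on the assumption that they are 
the residues intended in the table. 

```python
# Table 2 as a mapping: original residue -> exemplary substitutions.
TABLE_2 = {
    "Ala": ("Ser",),
    "Arg": ("Lys",),
    "Asn": ("Gln",),
    "Asp": ("Glu",),
    "Cys": ("Ser",),
    "Gln": ("Asn",),
    "Glu": ("Asp",),
    "Gly": ("Pro",),
    "His": ("Gln",),
    "Ile": ("Leu", "Val"),
    "Leu": ("Ile", "Val"),
    "Lys": ("Arg", "Gln"),
    "Met": ("Leu", "Ile"),
    "Phe": ("Met", "Leu", "Tyr"),
    "Pro": ("Gly",),
    "Ser": ("Thr",),
    "Thr": ("Ser",),
    "Trp": ("Tyr",),
    "Tyr": ("Trp", "Phe"),
    "Val": ("Ile", "Leu"),
}


def is_table2_substitution(original, replacement):
    """True if `replacement` is listed in Table 2 for `original`."""
    return replacement in TABLE_2.get(original, ())


print(is_table2_substitution("Phe", "Tyr"))
print(is_table2_substitution("Ala", "Trp"))
```

Note that the table is directional: a pair listed for one residue is not necessarily listed for 
the other (for example, Cys to Ser is listed, but Ser to Cys is not). 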



For example, the replacement of one amino acid residue with another that is biologically 
and/or chemically similar is known to those skilled in the art as a conservative substitution. For 
example, a conservative substitution would be replacing one hydrophobic residue for another, or 
one polar residue for another. The substitutions include combinations such as, for example, Gly, 
Ala; Val, Ile, Leu; Asp, Glu; Asn, Gln; Ser, Thr; Lys, Arg; and Phe, Tyr. Such conservatively 
substituted variations of each explicitly disclosed sequence are included within the mosaic 
polypeptides provided herein. 

Substitutional or deletional mutagenesis can be employed to insert sites for 
N-glycosylation (Asn-X-Thr/Ser) or O-glycosylation (Ser or Thr). Deletions of cysteine or other 
labile residues also may be desirable. Deletions or substitutions of potential proteolysis sites, 
e.g. Arg, are accomplished, for example, by deleting one of the basic residues or substituting one 
by glutaminyl or histidyl residues. 
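
As a non-limiting sketch, candidate N-glycosylation sites of the form Asn-X-Thr/Ser can be 
located in a one-letter protein sequence with a regular expression. The function name and the 
example peptide are hypothetical, and X is treated here as any residue, as written above. 

```python
import re

# Asn, then any residue, then Ser or Thr. The lookahead lets overlapping
# sites (such as N-N-T-S) be reported individually.
SEQUON = re.compile(r"N(?=[A-Z][ST])")


def find_n_glycosylation_sites(protein):
    """Return 1-based positions of each Asn that starts an Asn-X-Thr/Ser motif."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Hypothetical peptide, for illustration only.
print(find_n_glycosylation_sites("MNGSANATK"))
```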

Certain post-translational derivatizations are the result of the action of recombinant host 
cells on the expressed polypeptide. Glutaminyl and asparaginyl residues are frequently 
post-translationally deamidated to the corresponding glutamyl and aspartyl residues. Alternatively, 
these residues are deamidated under mildly acidic conditions. Other post-translational 
modifications include hydroxylation of proline and lysine, phosphorylation of hydroxyl groups 
of seryl or threonyl residues, methylation of the α-amino groups of lysine, arginine, and histidine 
side chains (T.E. Creighton, Proteins: Structure and Molecular Properties, W. H. Freeman & 
Co., San Francisco pp 79-86 [1983]), acetylation of the N-terminal amine and, in some 
instances, amidation of the C-terminal carboxyl. 

The compositions disclosed herein can be used as targets in combinatorial chemistry 
protocols or other screening protocols to isolate molecules that possess desired functional 
properties related to increasing the level of an ARP in a subject. 

As disclosed above, the disclosed compositions, such as cytidine deaminases or 
deoxycytidine deaminases (e.g., CEM15 and other ARPs) or Vif can be used as targets for any 
combinatorial technique to identify molecules or macromolecular molecules that interact with 
the disclosed compositions in a desired way or mimic their function. The nucleic acids, 
peptides, and related molecules disclosed herein can be used as targets for the combinatorial 
approaches. 

It is understood that when using the disclosed compositions in combinatorial techniques 
or screening methods, molecules, such as macromolecular molecules, will be identified that have 
particular desired properties such as stimulation of the target molecule's function. The 
molecules identified and isolated when using the disclosed compositions, such as CEM15, other 
ARPs, or Vif, are also disclosed. Thus, the products produced using the combinatorial or 
screening approaches that involve the disclosed compositions, such as CEM15, other ARPs, or 
Vif, are also disclosed. Such molecules include Vif antagonists and cytidine deaminase 
activators. 

Combinatorial chemistry includes but is not limited to all methods for isolating small 
molecules or macromolecules that are capable of binding either a small molecule or another 
macromolecule like Vif or cytidine deaminase (e.g., CEM15), typically in an iterative process. 
Proteins, oligonucleotides, and sugars are examples of macromolecules. For example, 
oligonucleotide molecules with a given function, catalytic or ligand-binding, can be isolated 
from a complex mixture of random oligonucleotides in what has been referred to as "in vitro 
genetics" (Szostak, TIBS 19:89, 1992). One synthesizes a large pool of molecules bearing 
random and defined sequences and subjects that complex mixture, for example, approximately 
10^15 individual sequences in 100 µg of a 100 nucleotide RNA, to some selection and 
enrichment process. Through repeated cycles of affinity chromatography and PCR amplification 
of the molecules bound to the ligand on the column, Ellington and Szostak (1990) estimated that 
1 in 10^10 RNA molecules folded in such a way as to bind small molecule dyes. DNA 
molecules with such ligand-binding behavior have been isolated as well (Ellington and Szostak, 
1992; Bock et al., 1992). Techniques aimed at similar goals exist for small organic molecules, 
proteins, antibodies and other macromolecules known to those of skill in the art. Screening sets 
of molecules for a desired activity, whether based on small organic libraries, oligonucleotides, or 
antibodies, is broadly referred to as combinatorial chemistry. Combinatorial techniques are 
particularly suited for defining binding interactions between molecules and for isolating 
molecules that have a specific binding activity, often called aptamers when the macromolecules 
are nucleic acids. 
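
The pool size quoted above can be checked with simple arithmetic. The sketch below reads the 
figures as about 10^15 sequences in 100 µg of a 100 nucleotide RNA and a binding frequency of 
1 in 10^10 (with 100 mg the count would be a thousandfold larger), and it assumes an average 
mass of about 330 g/mol per RNA nucleotide, an approximation that is not taken from the text. 
The function name is hypothetical. 

```python
AVOGADRO = 6.02214076e23  # molecules per mole


def molecules_in_pool(mass_grams, length_nt, avg_nt_mass=330.0):
    """Estimate the number of RNA molecules in a pool of the given mass.

    avg_nt_mass (g/mol per nucleotide) is an assumed approximation.
    """
    molar_mass = length_nt * avg_nt_mass  # g/mol for one RNA molecule
    return mass_grams / molar_mass * AVOGADRO


pool = molecules_in_pool(100e-6, 100)  # 100 micrograms of 100-nt RNA
expected_binders = pool / 1e10         # at a frequency of 1 in 10^10
print(f"{pool:.2e} molecules, about {expected_binders:.0e} expected binders")
```

Under these assumptions the estimate is on the order of 10^15 molecules, so a frequency of 1 in 
10^10 would correspond to roughly 10^5 binding molecules in the starting pool. 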

As used herein combinatorial methods and libraries include traditional screening 
methods and libraries as well as methods and libraries used in iterative processes. 

The disclosed compositions can be used as targets for any molecular modeling technique 
to identify either the structure of the disclosed compositions or to identify potential or actual 
molecules, such as small molecules, which interact in a desired way with the disclosed 
compositions. The compounds disclosed herein can be used as targets in any molecular 
modeling program or approach. 

It is understood that when using the disclosed compositions in modeling techniques, 
molecules, such as macromolecular molecules, will be identified that have particular desired 
properties such as inhibition, suppression, or stimulation of the target molecule's function. 

One way to isolate molecules that bind a molecule of choice is through rational design. 
This is achieved through structural information and computer modeling. Computer modeling 
technology allows visualization of the three-dimensional atomic structure of a selected molecule 
and the rational design of new compounds that will interact with the molecule. The 
three-dimensional construct typically depends on data from x-ray crystallographic analyses or NMR 
imaging of the selected molecule. The molecular dynamics require force field data. The 
computer graphics systems enable prediction of how a new compound will link to the target 
molecule and allow experimental manipulation of the structures of the compound and target 
molecule to perfect binding specificity. Prediction of what the molecule-compound interaction 
will be when small changes are made in one or both requires molecular mechanics software and 
computationally intensive computers, usually coupled with user-friendly, menu-driven interfaces 
between the molecular design program and the user. 

Examples of molecular modeling systems are the CHARMm and QUANTA programs,
Polygen Corporation, Waltham, MA. CHARMm performs the energy minimization and
molecular dynamics functions. QUANTA performs the construction, graphic modeling and
analysis of molecular structure. QUANTA allows interactive construction, modification,
visualization, and analysis of the behavior of molecules with each other.

A number of articles review computer modeling of drugs interactive with specific
proteins, such as Rotivinen, et al., 1988 Acta Pharmaceutica Fennica 97, 159-166; Ripka, New
Scientist 54-57 (June 16, 1988); McKinlay and Rossmann, 1989 Annu. Rev. Pharmacol.
Toxicol. 29, 111-122; Perry and Davies, QSAR: Quantitative Structure-Activity Relationships
in Drug Design pp. 189-193 (Alan R. Liss, Inc. 1989); Lewis and Dean, 1989 Proc. R. Soc.
Lond. 236, 125-140 and 141-162; and, with respect to a model enzyme for nucleic acid
components, Askew, et al., 1989 J. Am. Chem. Soc. 111, 1082-1090. Other computer programs
that screen and graphically depict chemicals are available from companies such as BioDesign,
Inc., Pasadena, CA; Allelix, Inc., Mississauga, Ontario, Canada; and Hypercube, Inc.,
Cambridge, Ontario. Although these are primarily designed for application to drugs specific to
particular proteins, they can be adapted to design of molecules specifically interacting with
specific regions of DNA or RNA, once that region is identified.

Although described above with reference to design and generation of compounds which
can alter binding, one can also screen libraries of known compounds, including natural products
or synthetic chemicals, and biologically active materials, including proteins, for compounds
which alter substrate binding or enzymatic activity.

Also described is a compound that is identified or designed as a result of any of the
disclosed methods; such a compound can be obtained (or synthesized) and tested for its
biological activity, e.g., competitive stimulation of CEM15 or inhibition or suppression of
viral infectivity.

Disclosed herein are computer systems and databases containing information related to
APOBEC-1 Related Proteins and subjects. Since subjects will vary depending on numerous
parameters including, but not limited to, race, age, weight, medical history etc., as more
information is gathered on populations, the database can contain information classified by race,
age, weight, medical history etc., such that one of skill in the art can assess the subject's risk of
developing AIDS, the subject's susceptibility to a viral infection, the subject's ability to mount
an immune response and/or the subject's responsiveness to a therapeutic agent based on
information more closely associated with the subject's demographic profile.

The analysis of complex systems such as biological organisms is aided by the use of
relational database systems for storing and retrieving large amounts of biological data. The
advent of high-speed wide area networks and the Internet, together with the client/server based
model of relational database management systems, is particularly well-suited for allowing
researchers to access and meaningfully analyze large amounts of biological data given the
appropriate hardware and software computing tools.

The present invention provides a computer system comprising a) a database including
records comprising a plurality of reference information comprising the ARP level and associated
diagnosis and therapy data; and b) a user interface capable of receiving a selection of one or
more sets of information related to the subject's demographic profile.
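
As a minimal sketch only, the database and selection interface described above could be prototyped with an in-memory relational table. The table layout, the column names (`age_group`, `medical_history`), the helper names and every row in `EXAMPLE_RECORDS` are hypothetical placeholders introduced for illustration; none of them come from the disclosure.

```python
import sqlite3

# Hypothetical placeholder rows:
# (age_group, medical_history, arp_level, diagnosis, therapy).
# These values are invented to exercise the code and are not study data.
EXAMPLE_RECORDS = [
    ("30-39", "none", 190.0, "LTNP", "none"),
    ("30-39", "none", 100.0, "progressor", "antiretroviral"),
    ("40-49", "hepatitis B", 130.0, "progressor", "antiretroviral"),
]

# Demographic columns a user may select on (an assumption of this sketch).
PROFILE_COLUMNS = ("age_group", "medical_history")


def build_reference_db(records):
    """Create an in-memory database holding the reference records."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE reference ("
        "age_group TEXT, medical_history TEXT, "
        "arp_level REAL, diagnosis TEXT, therapy TEXT)"
    )
    conn.executemany("INSERT INTO reference VALUES (?, ?, ?, ?, ?)", records)
    conn.commit()
    return conn


def select_reference(conn, **profile):
    """Return (arp_level, diagnosis, therapy) rows matching a demographic profile."""
    unknown = set(profile) - set(PROFILE_COLUMNS)
    if unknown:
        raise ValueError(f"unsupported profile fields: {sorted(unknown)}")
    sql = "SELECT arp_level, diagnosis, therapy FROM reference"
    if profile:
        # Column names were validated above; the values are bound as parameters.
        sql += " WHERE " + " AND ".join(f"{col} = ?" for col in profile)
    sql += " ORDER BY arp_level"
    return conn.execute(sql, tuple(profile.values())).fetchall()
```

Calling `select_reference(build_reference_db(EXAMPLE_RECORDS), age_group="30-39")` returns the reference rows for that demographic group, against which a subject's ARP level could then be compared.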

Computer readable media include magnetically readable media, optically readable
media, electronically readable media and magnetic/optical media. For example, the computer
readable media may be a hard disc, a floppy disc, a magnetic tape, CD-ROM, DVD, RAM, or
ROM as well as other types of media known to those skilled in the art.

Embodiments of the present invention include systems, particularly computer systems
which contain the population information described herein. As used herein, "a computer system"
refers to the hardware components, software components, and data storage components used to
store and/or analyze the information of the present invention or other relevant information. The
computer system preferably includes the computer readable media described above, and a
processor for accessing and manipulating the data.

Preferably, the computer is a general purpose system that comprises a central processing
unit (CPU), one or more data storage components for storing data, and one or more data
retrieving devices for retrieving the data stored on the data storage components. A skilled artisan
can readily appreciate that any one of the currently available computer systems is suitable.

In one particular embodiment, the computer system includes a processor connected to a
bus which is connected to a main memory, preferably implemented as RAM, and one or more
data storage devices, such as a hard drive and/or other computer readable media having data
recorded thereon. In some embodiments, the computer system further includes one or more data
retrieving devices for reading the data stored on the data storage components. The data retrieving
device may represent, for example, a floppy disk drive, a compact disk drive, a magnetic tape
drive, a hard disk drive, a CD-ROM drive, a DVD drive, etc. In some embodiments, the data
storage component is a removable computer readable medium such as a floppy disk, a compact
disk, a magnetic tape, etc. containing control logic and/or data recorded thereon. The computer
system may advantageously include or be programmed by appropriate software for reading the
control logic and/or the data from the data storage component once inserted in the data retrieving
device. Software for accessing and processing the information of the invention (such as search
tools, compare tools, modeling tools, etc.) may reside in main memory during execution.

Another aspect of the present invention is a method for determining whether a given data
point from a subject differs from a reference point, comprising the step of reading the
information through use of a computer program which identifies differences between the test
subject's information and the reference information.
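
One way such a comparison could be sketched is as a z-score of the subject's value against a reference mean and standard deviation. The reference used below (132 ± 23 copies/µg cDNA, the HIV-uninfected group reported in Example 1) is taken from the text; the cutoff of two standard deviations and the names `Reference` and `differs_from_reference` are assumptions made for illustration, not part of the disclosed method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reference:
    """Reference distribution summarized by its mean and standard deviation."""
    mean: float
    sd: float


# HIV-uninfected group from Example 1 (copies/µg cDNA).
UNINFECTED = Reference(mean=132.0, sd=23.0)


def differs_from_reference(value, ref, z_cutoff=2.0):
    """Return (z, label) where label is 'higher', 'lower' or 'within'.

    z_cutoff is an illustrative threshold, not one stated in the source.
    """
    if ref.sd <= 0:
        raise ValueError("reference standard deviation must be positive")
    z = (value - ref.mean) / ref.sd
    if z > z_cutoff:
        label = "higher"
    elif z < -z_cutoff:
        label = "lower"
    else:
        label = "within"
    return z, label
```

For example, `differs_from_reference(321.0, UNINFECTED)` flags the value as higher than the reference, whereas a value of 114.0 falls within the chosen cutoff.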


EXAMPLES 

Example 1: APOBEC3G/CEM15/hA3G mRNA Levels Associate Inversely with HIV
Viremia

PBMCs were obtained from consenting human subjects and cryopreserved. Prior to
RNA isolation, cryopreserved PBMCs were thawed, washed with PBS, and stimulated with 1 µg
each of anti-CD3 and anti-CD28 antibodies for 18-20 hours. 2-5 x 10^6 cells were resuspended in
1 ml of TriReagent (MRC), and total cellular RNA was isolated according to standard protocols.
PolyA+ RNA was isolated using the MicroPoly(A) Purist kit (Ambion) and stored in RNase-free
water (Ambion) at -80°C. Purified polyA+ RNA was quantified by OD 260 and 280, and all
RNAs were found to have a 260/280 ratio of 1.95 or greater. hA3G gene expression was
examined by using Taqman chemistry with probes and primers designed to uniquely amplify
hA3G/APOBEC3G (NM_021 8220). The primers used were FWD:
5'CGCAGCCTGTGTCAGAAAAG3' (SEQ ID NO: 1, nucleotides 637-657), RVSE:
5'CCAACAGTGCTGAAATTCGTCATA3' (SEQ ID NO: 2, nucleotides 714-691) and Probe:
FAM-5'GTGCCACCATGAAGA3'-BHQ1 (SEQ ID NO: 3, nucleotides 668-682). The
following dye combinations for probe generation were used for detection and data
normalization: FAM (for the genes of interest), HEX (for normalizer genes, see below),
BHQ1 (non-fluorescent quencher) and ROX. Validation experiments were performed to
determine the specificity and efficiency of the primers and probes designed to selectively
amplify hA3G mRNA over closely related APOBEC3B (hA3B) and APOBEC3F (hA3F)
(Wedekind et al 2003). A commercially available primer/probe combination was used to
quantify GAPDH as a normalizing control sequence for the number of cell equivalents in
polyA+ mRNA starting material used for the quantification of hA3G mRNA. Following probe
and primer optimization, all reverse transcriptase, first strand cDNA products were diluted and
used in a 10 µl PCR reaction containing: 5 µl of ABI 2x Universal Master Mix, 1.25 µl of each of
the forward and reverse primers (final stock concentrations ranging from 200-900 nM depending on
the primer set), 1 µl of probe (stock ranging from 50-200 nM) and RNase/DNase free water. All
reactions were run in an ABI 7900 with 1 cycle of 50 °C (2 min) followed by 95 °C (10 min) and
40 cycles of 95 °C (15 sec) followed by 60 °C (1 min). Data were collected and analyzed using
Sequence Detection Software (ABI, Foster City CA), with relative quantitation determined using the
comparative threshold cycle (CT) method performed in Microsoft Excel (ABI Technote #2:
Relative Gene Expression Quantitation).
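
The nucleotide coordinates quoted for the primers and probe can be cross-checked against the template they are numbered on. The sketch below assumes that the coordinates refer to the 1717-nucleotide sequence given under <400> 5 in the sequence listing, and copies only nucleotides 601-720 of that sequence; the helper names `reverse_complement` and `locate` are hypothetical and introduced here for illustration.

```python
# Nucleotides 601-720 of the sequence listed under <400> 5 (assumed template).
TEMPLATE_OFFSET = 601
TEMPLATE = (
    "tactacttctgggacccagattaccaggaggcgcttcgcagcctgtgtcagaaaagagac"
    "ggtccgcgtgccaccatgaagatcatgaattatgacgaatttcagcactgttggagcaag"
).upper()

FWD = "CGCAGCCTGTGTCAGAAAAG"        # SEQ ID NO: 1
REV = "CCAACAGTGCTGAAATTCGTCATA"    # SEQ ID NO: 2 (anneals to the sense strand)
PROBE = "GTGCCACCATGAAGA"           # SEQ ID NO: 3

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def locate(oligo, template=TEMPLATE, offset=TEMPLATE_OFFSET):
    """Return 1-based (start, end) of the first match of oligo, or None."""
    i = template.find(oligo.upper())
    if i < 0:
        return None
    start = offset + i
    return start, start + len(oligo) - 1


fwd_span = locate(FWD)
probe_span = locate(PROBE)
# The reverse primer is searched as its reverse complement on the sense strand.
rev_span = locate(reverse_complement(REV))
amplicon_length = rev_span[1] - fwd_span[0] + 1
```

Under this numbering the probe maps to 668-682 and the reverse primer to 691-714, matching the stated coordinates. The 20-nucleotide forward primer maps to 637-656, so the stated end of 657 may be off by one. The resulting amplicon is 78 nucleotides long.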

Figure 7 is an example of this assay. Real-time PCR assays were performed using
samples from a subject with HIV infection, a positive control (CEM15 control, a plasmid
encoding hA3G cDNA) and a negative control (APOBEC3F, a plasmid encoding hA3F cDNA)
(Fig. 7a); GAPDH from the human sample was amplified in a separate reaction (Fig. 7b) as a
control for cell number and was used to normalize the hA3G quantification. Each sample was
tested in duplicate. These results indicated that hA3G quantification was within the reliable
detection limits of the assay. Importantly, GAPDH mRNA was expressed at a similar level in
each patient sample (Fig. 7b).

Using this method, six HIV-uninfected and twenty-five antiretroviral-naive, chronically
HIV-infected subjects were studied (Table 3). The HIV-infected subjects included eight LTNPs,
whose average viral load was 1.8 x 10^3 (±1.1 x 10^3) copies/ml and whose average CD4 count
was 755 (±284)/µl, and seventeen progressors, whose average viral load was 1.5 x 10^3
(±2.5 x 10^5) copies/ml and whose average CD4 count was 324 (±208)/µl. HIV-1 RNA levels
were quantified using the Amplicor HIV-1 Monitor assay (Roche Molecular Systems,
Branchburg, NJ), which has a detection limit of 50 HIV-1 RNA copies/ml. The CD4 counts and
percentages were determined using whole blood and the Multiset program (Becton Dickinson,
San Jose, CA) by flow cytometry techniques in a CLIA-certified laboratory. PBMCs from these
subjects were stimulated, and samples were coded and sent to another lab for polyA+ mRNA
extraction. The samples were recoded and sent for cDNA synthesis and real-time PCR assays.
The amounts of hA3G mRNA were standardized against the GAPDH levels in each sample, and
calculated as copies of mRNA/µg cDNA. The hA3G mRNA levels in each subject were
determined, and the average values (standard deviation) were 132 (±23) copies/µg cDNA in
HIV-uninfected subjects, 189 (±59) in LTNPs, and 105 (±15) in progressors. In all HIV-infected
subjects, the average was 132 (±53) (Table 3). By the Mann-Whitney test, the hA3G mRNA
levels in LTNPs were significantly higher than those in progressors (p<0.001) and
HIV-uninfected controls (p<0.020). In addition, the hA3G levels in HIV-uninfected controls
were also higher than those in progressors (p<0.008).
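
The GAPDH normalization described above, together with the comparative CT method named in the assay description, is commonly written as a fold change of 2^-ΔΔCT. The following is a sketch of that standard formula only; it assumes amplification efficiencies close to 100% for both targets, it does not reproduce the copies/µg cDNA calculation used in this example, and the function name and the CT values in the usage note are hypothetical.

```python
def relative_quantity(ct_target_sample, ct_ref_sample,
                      ct_target_calibrator, ct_ref_calibrator):
    """Fold change of a target by the comparative CT (2^-ddCT) method.

    ct_target_*: threshold cycle of the gene of interest (e.g. hA3G).
    ct_ref_*:    threshold cycle of the normalizer (e.g. GAPDH).
    Assumes both amplicons amplify with roughly 100% efficiency.
    """
    delta_sample = ct_target_sample - ct_ref_sample              # dCT, test sample
    delta_calibrator = ct_target_calibrator - ct_ref_calibrator  # dCT, calibrator
    delta_delta = delta_sample - delta_calibrator                # ddCT
    return 2.0 ** (-delta_delta)
```

With hypothetical CT values, `relative_quantity(24.0, 18.0, 25.0, 18.0)` gives a two-fold higher normalized target level in the sample than in the calibrator, because the target crosses threshold one cycle earlier at equal GAPDH CT.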

To determine if the augmented hA3G gene expression had any functional implications, a
rank correlation test between hA3G mRNA levels and HIV viremia and CD4 counts in the
twenty-five HIV-infected individuals was performed. There was a striking inverse correlation
between hA3G mRNA levels and viral loads (R=-0.7132, p<0.00009) (Fig. 8a) and a highly
significant positive correlation between hA3G mRNA levels and CD4 counts (R=0.7029,
p<0.00012) (Fig. 8b). Moreover, these correlations remain even after removing the one LTNP
(#1) who has the highest CEM15 value (R=-0.5988, p<0.0022 for viral load, and R=0.4962,
p<0.014 for CD4 count).
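
A rank correlation of this kind can be sketched as follows. The text does not say which rank test was used; Spearman's coefficient is assumed here for illustration. The data are only the seven LTNP rows that appear in Table 3 of this excerpt, so the result will not reproduce the reported R values, which were computed across all twenty-five HIV-infected subjects.

```python
import math

# Seven LTNP rows from Table 3 as they appear in this excerpt.
VIREMIA = [5.0e1, 8.1e2, 1.3e3, 1.7e3, 2.2e3, 2.6e3, 3.0e3]   # copies/ml
CD4 = [1320, 492, 591, 737, 648, 478, 1000]                   # cells/µl
MRNA = [321, 173, 189, 114, 175, 204, 161]                    # copies/µg cDNA


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


rho_viremia = spearman_rho(VIREMIA, MRNA)
rho_cd4 = spearman_rho(CD4, MRNA)
```

A p-value would additionally require a significance test on the coefficient, which this sketch omits.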



Table 3. CEM15 mRNA levels in HIV-infected and -uninfected study subjects

Subject group or     Viremia        CD4         Yr of HIV    CEM15 mRNA
patient no.          (copies/ml)    count/µl    infection    copies/µg of cDNA

HIV-uninfected (a)
  Mean                                                       132
  SD                                                         23

HIV-infected
  LTNPs
    1                5.0E+01        1,320       8            321
    2                8.1E+02        492         18           173
    3                1.3E+03        591         9            189
    4                1.7E+03        737         12           114
    5                2.2E+03        648         16           175
    6                2.6E+03        478         15           204
    7                3.0E+03        1,000       18           161

(a) n = 6.
(b) NA, not available.



Although it has been shown that hA3G contributes to the control of HIV and SIV
replication in cell cultures and animal experiments (Mariani 2003; Sheehy 2002), these results
are the first to demonstrate correlations between hA3G mRNA levels and HIV viral load and
CD4 count, both of which are predictors of HIV disease progression in patients who have not
received antiretroviral drugs or other forms of therapeutic intervention. In addition, as disclosed
herein, LTNPs have significantly higher hA3G mRNA levels than HIV-uninfected controls
and progressors, and the hA3G mRNA levels of progressors are significantly lower than those
of HIV-uninfected controls.

What is claimed is:

1. A method of predicting the severity of a viral infection in a subject, comprising 

a) acquiring a biological sample from the subject; and 

b) measuring the level of expression of one or more APOBEC-1 related proteins in 
the subject, wherein a higher level of expression as compared to a reference level 
indicates decreased severity. 

2. The method of claim 1, wherein the APOBEC-1 related protein is CEM15.

3. The method of claim 1, wherein the APOBEC-1 related protein is selected from the
group consisting of CEM15, APOBEC-3B, APOBEC-3C, APOBEC-3F.

4. The method of claim 1, wherein the subject has a viral infection when the levels of 
expression are measured. 

5. The method of claim 1, wherein the viral infection is a lentiviral infection.

6. The method of claim 1, wherein the lentiviral infection is an HIV-1 infection. 

7. The method of claim 1, wherein the decreased severity comprises an increased longevity 
in the subject as compared to a control. 

8. The method of claim 1, wherein the decreased severity comprises a longer asymptomatic
period in the subject as compared to a control.

9. The method of claim 8, wherein the asymptomatic period is one year or longer than the 
control asymptomatic period. 

10. The method of claim 1, wherein the decreased severity comprises high CD4 counts as 
compared to a control. 

11. The method of claim 1, wherein the decreased severity comprises lower HIV viremia
levels as compared to a control.

12. The method of claim 1, wherein expression is measured by detecting mRNA levels of
APOBEC-1 related protein.

13. The method of claim 12, wherein the mRNA levels are detected by PCR.

14. The method of claim 13, wherein the PCR is real time PCR (RTPCR).

15. The method of claim 1, wherein the expression is measured by detecting protein levels of
APOBEC-1 related proteins. 

16. The method of claim 15, wherein CEM15 protein levels are detected using Western 
blotting. 

17. The method of claim 15, wherein CEM15 protein levels are detected using ELISA. 

18. The method of claim 1, wherein the sample is a blood sample.

19. The method of claim 1, wherein the sample is a cellular extract.

20. The method of claim 1, wherein the sample is a tissue extract. 

21. A method of predicting whether a subject is or will be a long term nonprogressor (LTNP)
when infected with a virus, comprising

a) acquiring a biological sample from the subject; and

b) measuring the level of expression of one or more APOBEC-1 related proteins in
the subject, wherein a higher level of expression as compared to a reference level
indicates the subject is a potential LTNP.

22. The method of claim 21, wherein the APOBEC-1 related protein is CEM15.

23. The method of claim 21, wherein the APOBEC-1 related protein is selected from the 
group consisting of CEM15, APOBEC-3B, APOBEC-3C, APOBEC-3F. 

24. The method of claim 21, wherein the subject has a viral infection when the levels of 
expression are measured. 

25. The method of claim 21, wherein expression is measured by detecting mRNA for 
APOBEC-1 related proteins. 

26. The method of claim 25, wherein the mRNA levels are detected by PCR. 

27. The method of claim 26, wherein the PCR is real time PCR (RTPCR). 

28. The method of claim 21, wherein the expression is measured by detecting protein levels 
of the APOBEC-1 related proteins. 

29. The method of claim 28, wherein the protein levels of the APOBEC-1 related proteins 
are detected using Western blotting. 

30. The method of claim 28, wherein CEM15 protein levels are detected using ELISA.

31. The method of claim 21, wherein the sample is a blood sample.

32. A method of optimizing antiviral therapy in a subject with a viral infection comprising: 

a) acquiring a biological sample from the subject; 

b) detecting the level of expression of one or more APOBEC-1 related proteins in 
the sample; and 

c) adjusting the antiviral therapy according to the levels of APOBEC-1 related
proteins, thereby optimizing the antiviral therapy.

33. The method of claim 32, wherein the antiviral therapy is selected from the group
consisting of a nucleoside reverse transcriptase inhibitor, non-nucleoside reverse
transcriptase inhibitor, nucleotide reverse transcriptase inhibitor, protease inhibitor,
fusion inhibitor, and an integrase inhibitor.

34. The method of claim 32, wherein the antiviral therapy is reduced when the expression
levels of APOBEC-1 related proteins are high as compared to a reference level.

35. The method of claim 32, wherein the antiviral therapy is increased when the expression
levels of APOBEC-1 related proteins are low as compared to a reference level.

36. The method of claim 32, wherein the APOBEC-1 related protein is CEM15.

37. The method of claim 32, wherein the APOBEC-1 related protein is selected from the
group consisting of CEM15, APOBEC-3B, APOBEC-3C, APOBEC-3F.

38. A method of predicting the level of CD4 cells in a subject, comprising 

a) acquiring a biological sample from the subject; 

b) detecting the level of CEM15 expression in the subject, the level of CEM15 
correlating with the level of CD4 cells. 

39. A method of monitoring effectiveness of an antiviral agent in a subject, comprising: 

a) detecting expression levels of one or more APOBEC-1 related proteins in a first 
biological sample from the subject prior to administration of the agent; and 

b) detecting expression levels of one or more APOBEC-1 related proteins in a 
second biological sample from the subject after administration of the agent, an 
increase in expression levels of the APOBEC-1 related proteins in the second 
sample as compared to the first sample indicating effectiveness of the antiviral 
agent. 

40. The method of claim 39, wherein the APOBEC-1 related protein is CEM15.

41. The method of claim 40, wherein the agent targets a Vif/CEM15 interaction.

42. The method of claim 41, wherein the agent is a Vif antagonist.

43. The method of claim 39, wherein the APOBEC-1 related protein is selected from the
group consisting of CEM15, APOBEC-3B, APOBEC-3C, APOBEC-3F.

44. The method of claim 39, wherein the agent is selected from the group consisting of 
nucleoside reverse transcriptase inhibitors, non-nucleoside reverse transcriptase 
inhibitors, nucleotide reverse transcriptase inhibitors, protease inhibitor, and fusion 
inhibitors. 

45. The method of claim 39, wherein the subject has a viral infection when the levels of 
expression are measured. 

46. The method of claim 45, wherein the subject is infected with a lentivirus. 

47. The method of claim 45, wherein the lentivirus is HIV-1. 

48. A method of screening for an antiviral agent, comprising 

a) administering to a subject with a viral infection an agent to be screened; and

b) detecting expression levels of one or more APOBEC-1 related proteins in a
biological sample from the subject, an increased expression level indicating an
antiviral agent.

49. The method of claim 48, wherein the APOBEC-1 related protein is CEM15.

50. The method of claim 48, wherein the APOBEC-1 related protein is selected from the 
group consisting of CEM15, APOBEC-3B, APOBEC-3C, APOBEC-3F. 

51. An isolated nucleic acid sequence comprising a sequence at least 80% identical to SEQ
ID NO: 1.

52. An isolated nucleic acid sequence comprising the nucleotide sequence of SEQ ID NO: 1, 
or variants or fragments thereof, wherein the variant or fragment comprises a specific 
CEM15 primer. 

53. An isolated nucleic acid sequence comprising at least five consecutive nucleotides of 
SEQ ID NO: 1, wherein the nucleic acid sequence comprises a specific CEM15 primer. 

54. An isolated nucleic acid sequence comprising the sequence of SEQ ID NO: 1.

55. An isolated nucleic acid sequence comprising a sequence at least 80% identical to SEQ 
ID NO: 2. 

56. An isolated nucleic acid sequence comprising the nucleotide sequence of SEQ ID NO: 2, 
or variants or fragments thereof, wherein the variant or fragment comprises a specific 
CEM15 primer. 

57. An isolated nucleic acid sequence comprising at least five consecutive nucleotides of 
SEQ ID NO: 2, wherein the nucleic acid sequence comprises a specific CEM15 primer. 

58. An isolated nucleic acid sequence comprising SEQ ID NO: 2.

59. An isolated nucleic acid sequence comprising a sequence at least 80% identical to SEQ 
ID NO: 3. 

60. An isolated nucleic acid sequence comprising the nucleotide sequence of SEQ ID NO: 3, 
or variants or fragments thereof, wherein the variant or fragment comprises a specific 
CEM15 probe.

61. An isolated nucleic acid sequence comprising at least five consecutive nucleotides of 
SEQ ID NO: 3, wherein the nucleic acid sequence comprises a specific CEM15 probe. 

62. An isolated nucleic acid sequence comprising at least five consecutive nucleotides of
SEQ ID NO: 15, wherein the nucleic acid sequence comprises a specific APOBEC-3B
primer.

63. An isolated nucleic acid sequence comprising at least five consecutive nucleotides of
SEQ ID NO: 16, wherein the nucleic acid sequence comprises a specific APOBEC-3B
primer.

64. An isolated nucleic acid sequence comprising at least five consecutive nucleotides of 
SEQ ID NO: 17, wherein the nucleic acid sequence comprises a specific APOBEC-3C 
primer. 

65. An isolated nucleic acid sequence comprising at least five consecutive nucleotides of 
SEQ ID NO: 18, wherein the nucleic acid sequence comprises a specific APOBEC-3F 
primer. 

66. An isolated nucleic acid sequence comprising at least five consecutive nucleotides of 
SEQ ID NO: 19, wherein the nucleic acid sequence comprises a specific APOBEC-3F 
primer. 

67. A method of treating a subject infected with a virus with an appropriate antiviral agent, 
comprising the steps of: 

a) identifying a population of subjects with a given range of APOBEC-1 related 
protein levels; 

b) determining which antiviral agent is most effective at the given range of 
APOBEC-1 related protein levels; and 

c) administering an appropriate antiviral agent to the subject in need thereof. 

68. The method of claim 67, wherein the APOBEC-1 related protein is selected from the 
group consisting of CEM15, APOBEC-3B, APOBEC-3C, APOBEC-3F. 

69. The method of claim 67, wherein the agent targets a Vif/CEM15 interaction. 

70. The method of claim 69, wherein the agent is a Vif antagonist. 

71. The method of claim 67, wherein the agent is selected from the group consisting of 
nucleoside reverse transcriptase inhibitors, non-nucleoside reverse transcriptase 
inhibitors, nucleotide reverse transcriptase inhibitors, protease inhibitor, and fusion 
inhibitors. 

72. The method of claim 67, wherein the subject has a viral infection when the levels of 
expression are measured. 

73. The method of claim 72, wherein the subject is infected with a lentivirus. 

74. The method of claim 72, wherein the lentivirus is HIV-1.

<220>

<223> Description of Artificial Sequences; note = 
synthetic construct 



<400> 5 

ctgccagggg gagggcccca gagaaaacca gaaagagggt gagagactga ggaagataaa 60

gcgtcccagg gcctcctaca ccagcgcctg agcaggaagc gggaggggcc atgactacga 120

ggccctggga ggtcacttta gggagggctg tcctaaaacc agaagcttgg agcagaaagt 180 

gaaaccctgg tgctccagac aaagatctta gtcgggacta gccggccaag gatgaagcct 240

cacttcagaa acacagtgga gcgaatgtat cgagacacat tctcctacaa cttttataat 300 

agacccatcc tttctcgtcg gaataccgtc tggctgtgct acgaagtgaa aacaaagggt 360 

ccctcaaggc cccctttgga cgcaaagatc tttcgaggcc aggtgtattc cgaacttaag 420

taccacccag agatgagatt cttccactgg ttcagcaagt ggaggaagct gcatcgtgac 480 

caggagtatg aggtcacctg gtacatatcc tggagcccct gcacaaagtg tacaagggat 540 

atggccacgt tcctggccga ggacccgaag gttaccctga ccatcttcgt tgcccgcctc 600 

tactacttct gggacccaga ttaccaggag gcgcttcgca gcctgtgtca gaaaagagac 660 

ggtccgcgtg ccaccatgaa gatcatgaat tatgacgaat ttcagcactg ttggagcaag 720

ttcgtgtaca gccaaagaga gctatttgag ccttggaata atctgcctaa atattatata 780 

ttactgcaca tcatgctggg ggagattctc agacactcga tggatccacc cacattcact 840 

ttcaacttta acaatgaacc ttgggtcaga ggacggcatg agacttacct gtgttatgag 900 

gtggagcgca tgcacaatga cacctgggtc ctgctgaacc agcgcagggg ctttctatgc 960 

aaccaggctc cacataaaca cggtttcctt gaaggccgcc atgcagagct gtgcttcctg 1020 

gacgtgattc ccttttggaa gctggacctg gaccaggact acagggttac ctgcttcacc 1080 

tcctggagcc cctgcttcag ctgtgcccag gaaatggcta aattcatttc aaaaaacaaa 1140 

cacgtgagcc tgtgcatctt cactgcccgc atctatgatg atcaaggaag atgtcaggag 1200

gggctgcgca ccctggccga ggctggggcc aaaatttcaa taatgacata cagtgaattt 1260 

aagcactgct gggacacctt tgtggaccac cagggatgtc ccttccagcc ctgggatgga 1320 

ctagatgagc acagccaaga cctgagtggg aggctgcggg ccattctcca gaatcaggaa 1380

aactgaagga tgggcctcag tctctaagga aggcagagac ctgggttgag cctcagaata 1440

aaagatcttc ttccaagaaa tgcaaacagg ctgttcacca ccatctccag ctgatcacag 1500 

acaccagcaa agcaatgcac tcctgaccaa gtagattctt ttaaaaatta gagtgcatta 1560 

ctttgaatca aaaatttatt tatatttcaa gaataaagta ctaagattgt gctcaataca 1620 

cagaaaagtt tcaaacctac taatccagcg acaatttgaa tcggttttgt aggtagagga 1680 

ataaaatgaa atactaaatc tttctgtaaa aaaaaaa 1717 



<210> 6 
<211> 384 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 

<400> 6 

Met Lys Pro His Phe Arg Asn Thr Val Glu Arg Met Tyr Arg Asp Thr
1               5                   10                  15

Phe Ser Tyr Asn Phe Tyr Asn Arg Pro Ile Leu Ser Arg Arg Asn Thr
            20                  25                  30

Val Trp Leu Cys Tyr Glu Val Lys Thr Lys Gly Pro Ser Arg Pro Pro
        35                  40                  45

Leu Asp Ala Lys Ile Phe Arg Gly Gln Val Tyr Ser Glu Leu Lys Tyr
    50                  55                  60

His Pro Glu Met Arg Phe Phe His Trp Phe Ser Lys Trp Arg Lys Leu
65                  70                  75                  80

His Arg Asp Gln Glu Tyr Glu Val Thr Trp Tyr Ile Ser Trp Ser Pro
                85                  90                  95

Cys Thr Lys Cys Thr Arg Asp Met Ala Thr Phe Leu Ala Glu Asp Pro
            100                 105                 110

Lys Val Thr Leu Thr Ile Phe Val Ala Arg Leu Tyr Tyr Phe Trp Asp
        115                 120                 125

Pro Asp Tyr Gln Glu Ala Leu Arg Ser Leu Cys Gln Lys Arg Asp Gly
    130                 135                 140

Pro Arg Ala Thr Met Lys Ile Met Asn Tyr Asp Glu Phe Gln His Cys
145                 150                 155                 160

Trp Ser Lys Phe Val Tyr Ser Gln Arg Glu Leu Phe Glu Pro Trp Asn
                165                 170                 175

Asn Leu Pro Lys Tyr Tyr Ile Leu Leu His Ile Met Leu Gly Glu Ile
            180                 185                 190

Leu Arg His Ser Met Asp Pro Pro Thr Phe Thr Phe Asn Phe Asn Asn
        195                 200                 205

Glu Pro Trp Val Arg Gly Arg His Glu Thr Tyr Leu Cys Tyr Glu Val
    210                 215                 220

Glu Arg Met His Asn Asp Thr Trp Val Leu Leu Asn Gln Arg Arg Gly
225                 230                 235                 240

Phe Leu Cys Asn Gln Ala Pro His Lys His Gly Phe Leu Glu Gly Arg
                245                 250                 255

His Ala Glu Leu Cys Phe Leu Asp Val Ile Pro Phe Trp Lys Leu Asp
            260                 265                 270

Leu Asp Gln Asp Tyr Arg Val Thr Cys Phe Thr Ser Trp Ser Pro Cys
        275                 280                 285

Phe Ser Cys Ala Gln Glu Met Ala Lys Phe Ile Ser Lys Asn Lys His
    290                 295                 300

Val Ser Leu Cys Ile Phe Thr Ala Arg Ile Tyr Asp Asp Gln Gly Arg
305                 310                 315                 320

Cys Gln Glu Gly Leu Arg Thr Leu Ala Glu Ala Gly Ala Lys Ile Ser
                325                 330                 335

Ile Met Thr Tyr Ser Glu Phe Lys His Cys Trp Asp Thr Phe Val Asp
            340                 345                 350

His Gln Gly Cys Pro Phe Gln Pro Trp Asp Gly Leu Asp Glu His Ser
        355                 360                 365

Gln Asp Leu Ser Gly Arg Leu Arg Ala Ile Leu Gln Asn Gln Glu Asn
    370                 375                 380



<210> 7 
<211> 1155 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 



<400> 7 

atgaagcctc acttcagaaa cacagtggag cgaatgtatc gagacacatt ctcctacaac 60

ttttataata gacccatcct ttctcgtcgg aataccgtct ggctgtgcta cgaagtgaaa 120

acaaagggtc cctcaaggcc ccctttggac gcaaagatct ttcgaggcca ggtgtattcc 180

gaacttaagt accacccaga gatgagattc ttccactggt tcagcaagtg gaggaagctg 240

catcgtgacc aggagtatga ggtcacctgg tacatatcct ggagcccctg cacaaagtgt 300

acaagggata tggccacgtt cctggccgag gacccgaagg ttaccctgac catcttcgtt 360

gcccgcctct actacttctg ggacccagat taccaggagg cgcttcgcag cctgtgtcag 420

aaaagagacg gtccgcgtgc caccatgaag atcatgaatt atgacgaatt tcagcactgt 480

tggagcaagt tcgtgtacag ccaaagagag ctatttgagc cttggaataa tctgcctaaa 540

tattatatat tactgcacat catgctgggg gagattctca gacactcgat ggatccaccc 600

acattcactt tcaactttaa caatgaacct tgggtcagag gacggcatga gacttacctg 660

tgttatgagg tggagcgcat gcacaatgac acctgggtcc tgctgaacca gcgcaggggc 720

tttctatgca accaggctcc acataaacac ggtttccttg aaggccgcca tgcagagctg 780

tgcttcctgg acgtgattcc cttttggaag ctggacctgg accaggacta cagggttacc 840

tgcttcacct cctggagccc ctgcttcagc tgtgcccagg aaatggctaa attcatttca 900

aaaaacaaac acgtgagcct gtgcatcttc actgcccgca tctatgatga tcaaggaaga 960

tgtcaggagg ggctgcgcac cctggccgag gctggggcca aaatttcaat aatgacatac 1020

agtgaattta agcactgctg ggacaccttt gtggaccacc agggatgtcc cttccagccc 1080

tgggatggac tagatgagca cagccaagac ctgagtggga ggctgcgggc cattctccag 1140

aatcaggaaa actga 1155



<210> 8 

<211> 382 

<212> PRT 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 



<400> 8 



Met Asn Pro Gln Ile Arg Asn Pro Met Glu Arg Met Tyr Arg Asp Thr
1 5 10 15

Phe Tyr Asp Asn Phe Glu Asn Glu Pro Ile Leu Tyr Gly Arg Ser Tyr
20 25 30

Thr Trp Leu Cys Tyr Glu Val Lys Ile Lys Arg Gly Arg Ser Asn Leu
35 40 45

Leu Trp Asp Thr Gly Val Phe Arg Gly Gln Val Tyr Phe Lys Pro Gln
50 55 60

Tyr His Ala Glu Met Cys Phe Leu Ser Trp Phe Cys Gly Asn Gln Leu
65 70 75 80

Pro Ala Tyr Lys Cys Phe Gln Ile Thr Trp Phe Val Ser Trp Thr Pro
85 90 95

Cys Pro Asp Cys Val Ala Lys Leu Ala Glu Phe Leu Ser Glu His Pro
100 105 110

Asn Val Thr Leu Thr Ile Ser Ala Ala Arg Leu Tyr Tyr Tyr Trp Glu
115 120 125

Arg Asp Tyr Arg Arg Ala Leu Cys Arg Leu Ser Gln Ala Gly Ala Arg
130 135 140

Val Thr Ile Met Asp Tyr Glu Glu Phe Ala Tyr Cys Trp Glu Asn Phe
145 150 155 160

Val Tyr Asn Glu Gly Gln Gln Phe Met Pro Trp Tyr Lys Phe Asp Glu
165 170 175

Asn Tyr Ala Phe Leu His Arg Thr Leu Lys Glu Ile Leu Arg Tyr Leu
180 185 190

Met Asp Pro Asp Thr Phe Thr Phe Asn Phe Asn Asn Asp Pro Leu Val
195 200 205

Leu Arg Arg Arg Gln Thr Tyr Leu Cys Tyr Glu Val Glu Arg Leu Asp
210 215 220

Asn Gly Thr Trp Val Leu Met Asp Gln His Met Gly Phe Leu Cys Asn
225 230 235 240

Glu Ala Lys Asn Leu Leu Cys Gly Phe Tyr Gly Arg His Ala Glu Leu
245 250 255

Arg Phe Leu Asp Leu Val Pro Ser Leu Gln Leu Asp Pro Ala Gln Ile
260 265 270

Tyr Arg Val Thr Trp Phe Ile Ser Trp Ser Pro Cys Phe Ser Trp Gly
275 280 285

Cys Ala Gly Glu Val Arg Ala Phe Leu Gln Glu Asn Thr His Val Arg
290 295 300

Leu Arg Ile Phe Ala Ala Arg Ile Tyr Asp Tyr Asp Pro Leu Tyr Lys
305 310 315 320

Glu Ala Leu Gln Met Leu Arg Asp Ala Gly Ala Gln Val Ser Ile Met
325 330 335

Thr Tyr Asp Glu Phe Glu Tyr Cys Trp Asp Thr Phe Val Tyr Arg Gln
340 345 350

Gly Cys Pro Phe Gln Pro Trp Asp Gly Leu Glu Glu His Ser Gln Ala
355 360 365

Leu Ser Gly Arg Leu Arg Ala Ile Leu Gln Asn Gln Gly Asn
370 375 380



<210> 9 
<211> 1536 
<212> DNA 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 



<400> 9 

acagagcttc aaaaaaagag cgggacaggg acaagcgtat ctaagaggct gaacatgaat 60 

ccacagatca gaaatccgat ggagcggatg tatcgagaca cattctacga caactttgaa 120 

aacgaaccca tcctctatgg tcggagctac acttggctgt gctatgaagt gaaaataaag 180 

aggggccgct caaatctcct ttgggacaca ggggtctttc gaggccaggt gtatttcaag 240

cctcagtacc acgcagaaat gtgcttcctc tcttggttct gtggcaacca gctgcctgct 300 

tacaagtgtt tccagatcac ctggtttgta tcctggaccc cctgcccgga ctgtgtggcg 360 

aagctggccg aattcctgtc tgagcacccc aatgtcaccc tgaccatctc tgccgcccgc 420 

ctctactact actgggaaag agattaccga agggcgctct gcaggctgag tcaggcagga 480

gcccgcgtga cgatcatgga ctatgaagaa tttgcatact gctgggaaaa ctttgtgtac 540 

aatgaaggtc agcaattcat gccttggtac aaattcgatg aaaattatgc attcctgcac 600 

cgcacgctaa aggagattct cagatacctg atggatccag acacattcac tttcaacttt 660 

aataatgacc ctttggtcct tcgacggcgc cagacctact tgtgctatga ggtggagcgc 720

ctggacaatg gcacctgggt cctgatggac cagcacatgg gctttctatg caacgaggct 780 

aagaatcttc tctgtggctt ttacggccgc catgcggagc tgcgcttctt ggacctggtt 840 

ccttctttgc agttggaccc ggcccagatc tacagggtca cttggttcat ctcctggagc 900 

ccctgcttct cctggggctg tgccggggaa gtgcgtgcgt tccttcagga gaacacacac 960 

gtgagactgc gcatcttcgc tgcccgcatc tatgattacg accccctata taaggaggcg 1020 

ctgcaaatgc tgcgggatgc tggggcccaa gtctccatca tgacctacga tgagtttgag 1080 

tactgctggg acacctttgt gtaccgccag ggatgtccct tccagccctg ggatggacta 1140 

gaggagcaca gccaagccct gagtgggagg ctgcgggcca ttctccagaa tcagggaaac 1200

tgaaggatgg gcctcagtct ctaaggaagg cagagacctg ggttgagcag cagaataaaa 1260 

gatcttcttc caagaaatgc aaacagaccg ttcaccacca tctccagctg ctcacagaca 1320 

ccagcaaagc aatgtgctcc tgatcaagta gattttttaa aaatcagagt caattaattt 1380 

taattgaaaa tttctcttat gttccaagtg tacaagagta agattatgct caatattccc 1440 

agaatagttt tcaatgtatt aatgaagtga ttaattggct ccatatttag actaataaaa 1500 

cattaagaat cttccataat tgtttccaca aacact 1536 



<210> 10 
<211> 190 
<212> PRT 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 



<400> 10 




























Met Asn Pro Gln Ile Arg Asn Pro Met Lys Ala Met Tyr Pro Gly Thr
1 5 10 15

Phe Tyr Phe Gln Phe Lys Asn Leu Trp Glu Ala Asn Asp Arg Asn Glu
20 25 30

Thr Trp Leu Cys Phe Thr Val Glu Gly Ile Lys Arg Arg Ser Val Val
35 40 45

Ser Trp Lys Thr Gly Val Phe Arg Asn Gln Val Asp Ser Glu Thr His
50 55 60

Cys His Ala Glu Arg Cys Phe Leu Ser Trp Phe Cys Asp Asp Ile Leu
65 70 75 80

WO 2007/126402

PCT/US2006/004920

Ser Pro Asn Thr Lys Tyr Gln Val Thr Trp Tyr Thr Ser Trp Ser Pro
85 90 95

Cys Pro Asp Cys Ala Gly Glu Val Ala Glu Phe Leu Ala Arg His Ser
100 105 110

Asn Val Asn Leu Thr Ile Phe Thr Ala Arg Leu Tyr Tyr Phe Gln Tyr
115 120 125

Pro Cys Tyr Gln Glu Gly Leu Arg Ser Leu Ser Gln Glu Gly Val Ala
130 135 140

Val Glu Ile Met Asp Tyr Glu Asp Phe Lys Tyr Cys Trp Glu Asn Phe
145 150 155 160

Val Tyr Asn Asp Asn Glu Pro Phe Lys Pro Trp Lys Gly Leu Lys Thr
165 170 175

Asn Phe Arg Leu Leu Lys Arg Arg Leu Arg Glu Ser Leu Gln
180 185 190







<210> 11 

<211> 1127 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 



<400> 11 

ttaaagaggg ctgctcaact gcaaggacgc tgtaagcagg aagagaagcc acagcgcttc 60 

agaaaagagt gggacaggga caagcatatc taagaggctg aacatgaatc cacagatcag 120

aaacccgatg aaggcaatgt atccaggcac attctacttc caatttaaaa acctatggga 180 

agccaacgat cggaacgaaa cttggctgtg cttcaccgtg gaaggtataa agcgccgctc 240 

agttgtctcc tggaagacgg gcgtcttccg aaaccaggtg gattctgaga cccattgtca 300 

tgcagaaagg tgcttcctct cttggttctg cgacgacata ctgtctccta acacaaagta 360 

ccaggtcacc tggtacacat cttggagccc ttgcccagac tgtgcagggg aggtggccga 420

gttcctggcc aggcacagca acgtgaatct caccatcttc accgcccgcc tctactactt 480 

ccagtatcca tgttaccagg aggggctccg cagcctgagt caggaagggg tcgctgtgga 540 

gatcatggac tatgaagatt ttaaatattg ttgggaaaac tttgtgtaca atgataatga 600 

gccattcaag ccttggaagg gattaaaaac caactttcga cttctgaaaa gaaggctacg 660 

ggagagtctc cagtgagggg tctccctggg cctcatggtc tgtctcctct agcctcctgc 720 

tcatgctgca cgggcctccc ctccaccctg gacccgctct gtttctgcct ggtcatcctg 780 

agcccctcct ggcctcaggg ccattccaca gtgctcccct gcctcaccgc ttcctcctcg 840 

ctcttccaga ctcttcctgc agaggctcct ttctgcctcc atggctatcc atccaccccc 900 

acagaccccg ttcctccagc ctgcgtgccc ctaacctggc ttttcccatc tccccagcat 960 

aaccaaatct tactaaactc atcctaggct gggcatggtg actcacgcct gtaatccccc 1020 

agcaatttgg gaggcaaagg tgggagaatc gcgtgagccc aggagttcca gaccaggctg 1080 

ggtcacatga caaagcccca tctctacaaa aaaaaaaaaa aaaaaaa 1127 



<210> 12 
<211> 373 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 

<400> 12 

Met Lys Pro His Phe Arg Asn Thr Val Glu Arg Met Tyr Arg Asp Thr
1 5 10 15

Phe Ser Tyr Asn Phe Tyr Asn Arg Pro Ile Leu Ser Arg Arg Asn Thr
20 25 30

Val Trp Leu Cys Tyr Glu Val Lys Thr Lys Gly Pro Ser Arg Pro Arg
35 40 45

Leu Asp Ala Lys Ile Phe Arg Gly Gln Val Tyr Ser Gln Pro Glu His
50 55 60

His Ala Glu Met Cys Phe Leu Ser Trp Phe Cys Gly Asn Gln Leu Pro
65 70 75 80

Ala Tyr Lys Cys Phe Gln Ile Thr Trp Phe Val Ser Trp Thr Pro Cys
85 90 95

Pro Asp Cys Val Ala Lys Leu Ala Glu Phe Leu Ala Glu His Pro Asn
100 105 110

Val Thr Leu Thr Ile Ser Ala Ala Arg Leu Tyr Tyr Tyr Trp Glu Arg
115 120 125

Asp Tyr Arg Arg Ala Leu Cys Arg Leu Ser Gln Ala Gly Ala Arg Val
130 135 140

Lys Ile Met Asp Asp Glu Glu Phe Ala Tyr Cys Trp Glu Asn Phe Val
145 150 155 160

Tyr Ser Glu Gly Gln Pro Phe Met Pro Trp Tyr Lys Phe Asp Asp Asn
165 170 175

Tyr Ala Phe Leu His Arg Thr Leu Lys Glu Ile Leu Arg Asn Pro Met
180 185 190

Glu Ala Met Tyr Pro His Ile Phe Tyr Phe His Phe Lys Asn Leu Arg
195 200 205

Lys Ala Tyr Gly Arg Asn Glu Ser Trp Leu Cys Phe Thr Met Glu Val
210 215 220

Val Lys His His Ser Pro Val Ser Trp Lys Arg Gly Val Phe Arg Asn
225 230 235 240

Gln Val Asp Pro Glu Thr His Cys His Ala Glu Arg Cys Phe Leu Ser
245 250 255

Trp Phe Cys Asp Asp Ile Leu Ser Pro Asn Thr Asn Tyr Glu Val Thr
260 265 270

Trp Tyr Thr Ser Trp Ser Pro Cys Pro Glu Cys Ala Gly Glu Val Ala
275 280 285

Glu Phe Leu Ala Arg His Ser Asn Val Asn Leu Thr Ile Phe Thr Ala
290 295 300

Arg Leu Tyr Tyr Phe Trp Asp Thr Asp Tyr Gln Glu Gly Leu Arg Ser
305 310 315 320

Leu Ser Gln Glu Gly Ala Ser Val Glu Ile Met Gly Tyr Lys Asp Phe
325 330 335

Lys Tyr Cys Trp Glu Asn Phe Val Tyr Asn Asp Asp Glu Pro Phe Lys
340 345 350

Pro Trp Lys Gly Leu Lys Tyr Asn Phe Leu Phe Leu Asp Ser Lys Leu
355 360 365

Gln Glu Ile Leu Glu
370



<210> 13 
<211> 2672 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 



<400> 13 

ttccctttgc aattgccttg ggtcctgccg cacagagcgg cctgtcttta tcagaggtcc 60

ctctgccagg gggagggccc cagagaaaac cagaaagagg gtgagagact gaggaagata 120

aagcgtccca gggcctccta caccagcgcc tgagcaggaa gggggagggg ccatgactac 180

gaggccctgg gaggtcactt tagggagggc tgtcctgaaa cctggagcct ggagcagaaa 240

gtgaaaccct ggtgctccag acaaagatct tagtcgggac tagccggcca aggatgaagc 300

ctcacttcag aaacacagtg gagcgaatgt atcgagacac attctcctac aacttttata 360

atagacccat cctttctcgt cggaataccg tctggctgtg ctacgaagtg aaaacaaagg 420

gtccctcaag gccccgtttg gacgcaaaga tctttcgagg ccaggtgtat tcccagcctg 480

agcaccacgc agaaatgtgc ttcctctctt ggttctgtgg caaccagctg cctgcttaca 540 

agtgtttcca gatcacctgg tttgtatcct ggaccccctg cccggactgt gtggcgaagc 600 

tggccgaatt cctggctgag caccccaatg tcaccctgac catctccgcc gcccgcctct 660 

actactactg ggaaagagat taccgaaggg cgctctgcag gctgagtcag gcaggggccc 720 

gcgtgaagat tatggacgat gaagaatttg catactgctg ggaaaacttt gtgtacagtg 780 

aaggtcagcc attcatgcct tggtacaaat tcgatgacaa ttatgcattc ctgcaccgca 840 

cgctaaagga gattctcaga aacccgatgg aggcaatgta tccacacata ttctacttcc 900 

actttaaaaa cctacgcaaa gcctatggtc ggaacgaaag ctggctgtgc ttcaccatgg 960 

aagttgtaaa gcaccactca cctgtctcct ggaagagggg cgtcttccga aaccaggtgg 1020

atcctgagac ccattgtcat gcagaaaggt gcttcctctc ttggttctgt gacgacatac 1080 

tgtctcctaa cacaaactac gaggtcacct ggtacacatc ttggagccct tgcccagagt 1140 

gtgcagggga ggtggccgag ttcctggcca ggcacagcaa cgtgaatctc accatcttca 1200

ccgcccgcct ctactacttc tgggatacag attaccagga ggggctccgc agcctgagtc 1260

aggaaggggc ctccgtggag atcatgggct acaaagattt taaatattgt tgggaaaact 1320

ttgtgtacaa tgatgatgag ccattcaagc cttggaaagg actaaaatac aactttctat 1380

tcctggacag caagctgcag gagattctcg agtgaggggt ctccccgggc ctcatggtct 1440

gtctcctcta gcctcctgct catgttgtgc aggcctcccc tccatcctgg accagctgtg 1500 

cttttgcctg gtcatcctga gcccctcctg gcctcagggc cattccatag tgctcccctg 1560 

cctcaccacc tcctctccgc tctcccaggc tcttcctgca gaggcctctt tctgcctcca 1620 

tggctatcca tccacccacc aagaccctgt tccctgagcc tgcatgcccc taacctgcct 1680 

tttcccatct ccccagcata acctaatatt tttttttttt ttttgagacg gaatttcgct 1740 

ctgtcaccca gactggagtg caatggcttg atcttggctc actgcaaact ctgcctacca 1800 

ggttcaagcg attctcctgc ctccgcctcc cgagtagctg gaattacaga cgcctgccac 1860 

cacgcacagc taactttttt tttttttgta tttttagtag tgactgggtt tcaccatgtt 1920 

ggccaggctg gtcttgaact cctgacctca ggtgatccgc ctatctcagc ctcccaaagt 1980 

gctgggatta caggcgtgag ccactggccc ggcggcacaa ccaaatctta ttaaactcac 2040

cctaggctgg ccgcggtgac tcatgcctat aatcccccag caatttggga ggcagaggtg 2100 

agagaatcgc ttgagcccag gaattcgaga ccagcctggg ccacatgaca aagccccatc 2160 

tctacaaaaa aattacaaaa aaaaaaaaaa caggtgtggt ggcatgcacc tgtagtttaa 2220

gctgcttgga aggatgaagt gggaggattg cttgagccgg ggaggtggag gctgcagtga 2280

actgagatca cgtcactgaa ctccagtctg agcaacagat cgagaccctg cctgaaaata 2340 

aatcaataaa taaactcaac cgaaatgggt atgaaagttg aaatgggtat gtaagttgaa 2400

aaccagaagt tttgagaaac atcctttgtt aactttcatc ctacaaattg ggtcattcat 2460 

gtcctacgca gctaaaacag agcccaggag ccagggagga aaagcagtca ggccacacac 2520

cattgctccc aaaatggact tctctgcaag cctgactcct gaaactgtgc attgtaccct 2580

gaaaccagct ttatccatag cttctgcaat aaatggctgt aagtcttgga aaaaaaaaaa 2640

aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aa 2672 



<210> 14 

<211> 4 

<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 

<400> 14 
Pro Pro Leu Pro 
1 

<210> 15 
<211> 67 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct

<400> 15

acagagcttc aaaaaaagag cgggacaggg acaagcgtat ctaagaggct gaacatgaat 60

ccacaga 67 

<210> 16 
<211> 97 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 

<400> 16 

cctggggctg tgccggggaa gtgcgtgcgt tccttcagga gaacacacac gtgagactgc 60 
gcatcttcgc tgcccgcatc tatgattacg accccct 97 

<210> 17 
<211> 194 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 



<400> 17 

ttaaagaggg ctgctcaact gcaaggacgc tgtaagcagg aagagaagcc acagcgcttc 60

agaaaagagt gggacaggga caagcatatc taagaggctg aacatgaatc cacagatcag 120

aaacccgatg aaggcaatgt atccaggcac attctacttc caatttaaaa acctatggga 180 

agccaacgat cgga 194 



<210> 18 
<211> 60 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 

<400> 18 

ttccctttgc aattgccttg ggtcctgccg cacagagcgg cctgtcttta tcagaggtcc 60 

<210> 19 
<211> 397 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 



<400> 19 

aatgatgatg agccattcaa gccttggaaa ggactaaaat acaactttct attcctggac 60 

agcaagctgc aggagattct cgagtgaggg gtctccccgg gcctcatggt ctgtctcctc 120

tagcctcctg ctcatgttgt gcaggcctcc cctccatcct ggaccagctg tgcttttgcc 180 

tggtcatcct gagcccctcc tggcctcagg gccattccat agtgctcccc tgcctcacca 240 

cctcctctcc gctctcccag gctcttcctg cagaggcctc tttctgcctc catggctatc 300 

catccaccca ccaagaccct gttccctgag cctgcatgcc cctaacctgc cttttcccat 360 

ctccccagca taacctaata tttttttttt ttttttg 397

<210> 20

<211> 24 

<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 

<400> 20 

atactgctta aagtcgtgac aacc 24 

<210> 21 
<211> 15 
<212> DNA 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 

<400> 21 

cacggtggta cttct 15 

<210> 22 
<211> 142 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 

<400> 22 

Met Lys Val Gly Gly Ile Glu Asp Arg Gln Leu Glu Ala Leu Lys Arg
1 5 10 15

Ala Ala Leu Lys Ala Cys Glu Leu Ser Tyr Ser Pro Tyr Ser His Phe
20 25 30

Arg Val Gly Cys Ser Ile Leu Thr Asn Asn Asp Val Ile Phe Thr Gly
35 40 45

Ala Asn Val Glu Asn Ala Ser Tyr Ser Asn Cys Ile Cys Ala Glu Arg
50 55 60

Ser Ala Met Ile Gln Val Leu Met Ala Gly His Arg Ser Gly Trp Lys
65 70 75 80

Cys Met Val Ile Cys Gly Asp Ser Glu Asp Gln Cys Val Ser Pro Cys
85 90 95

Gly Val Cys Arg Gln Phe Ile Asn Glu Phe Val Val Lys Asp Phe Pro
100 105 110

Ile Val Met Leu Asn Ser Thr Gly Ser Arg Ser Lys Val Met Thr Met
115 120 125

Gly Glu Leu Leu Pro Met Ala Phe Gly Pro Ser His Leu Asn
130 135 140

<210> 23 
<211> 130 
<212> PRT 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequences; note =
synthetic construct



<400> 23 



Met Asn Arg Gln Glu Leu Ile Thr Glu Ala Leu Lys Ala Arg Asp Met
1 5 10 15

Ala Tyr Ala Pro Tyr Ser Lys Phe Gln Val Gly Ala Ala Leu Leu Thr
20 25 30

Lys Asp Gly Lys Val Tyr Arg Gly Asn Ile Glu Asn Ala Ala Tyr Ser
35 40 45

Met Cys Asn Cys Ala Glu Arg Thr Ala Leu Phe Lys Ala Val Ser Glu
50 55 60

Gly Asp Thr Glu Phe Gln Met Leu Ala Val Ala Ala Asp Thr Pro Gly
65 70 75 80

Pro Val Ser Pro Cys Gly Ala Cys Arg Gln Val Ile Ser Glu Leu Cys
85 90 95

Thr Lys Asp Val Ile Val Val Leu Thr Asn Leu Gln Gly Gln Ile Lys
100 105 110

Glu Met Thr Val Glu Glu Leu Leu Pro Gly Ala Phe Ser Ser Glu Asp
115 120 125

Leu His
130



<210> 24 
<211> 123 
<212> PRT 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 



<400> 24 
Glu Asp Ala Leu Ala Phe Ala Leu Leu Pro Leu Ala Ala Ala Cys Ala
1 5 10 15

Arg Thr Pro Leu Ser Asn Phe Asn Val Gly Ala Ile Ala Arg Gly Val
20 25 30

Ser Gly Thr Trp Tyr Phe Gly Ala Asn Met Glu Phe Ile Gly Ala Thr
35 40 45

Met Gln Gln Thr Val His Ala Glu Gln Ser Ala Ile Ser His Ala Trp
50 55 60

Leu Ser Gly Glu Lys Ala Leu Ala Ala Ile Thr Val Asn Tyr Thr Pro
65 70 75 80

Cys Gly His Cys Arg Gln Phe Met Asn Glu Leu Asn Ser Gly Leu Asp
85 90 95

Leu Arg Ile His Leu Pro Gly Arg Glu Ala His Ala Leu Arg Asp Tyr
100 105 110

Leu Pro Asp Ala Phe Gly Pro Lys Asp Leu Glu
115 120



<210> 25 
<211> 135 
<212> PRT 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 



<400> 25 

Met Thr Ser Glu Lys Gly Pro Ser Thr Gly Asp Pro Thr Leu Arg Arg
1 5 10 15

Arg Ile Glu Pro Trp Glu Phe Asp Val Phe Tyr Asp Pro Arg Glu Leu
20 25 30

Arg Lys Glu Ala Cys Leu Leu Tyr Glu Ile Lys Trp Gly Met Ser Arg
35 40 45

Lys Ile Trp Arg Ser Ser Gly Lys Asn Thr Thr Asn His Val Glu Val
50 55 60

Asn Phe Ile Lys Lys Phe Thr Ser Glu Arg Asp Phe His Pro Ser Ile
65 70 75 80

Ser Cys Ser Ile Thr Trp Phe Leu Ser Trp Ser Pro Cys Trp Glu Cys
85 90 95

Ser Gln Ala Ile Arg Glu Phe Leu Ser Arg His Pro Gly Val Thr Leu
100 105 110

Val Ile Tyr Val Ala Arg Leu Phe Trp His Met Asp Gln Gln Asn Arg
115 120 125

Gln Gly Leu Arg Asp Leu Val
130 135





















<210> 26 

<211> 130 

<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 



<400> 26 



Met Asp Ser Leu Leu Met Asn Arg Arg Lys Phe Leu Tyr Gln Phe Lys
1 5 10 15

Asn Val Arg Trp Ala Lys Gly Arg Arg Glu Thr Tyr Leu Cys Tyr Val
20 25 30

Val Lys Arg Arg Asp Ser Ala Thr Ser Phe Ser Leu Asp Phe Gly Tyr
35 40 45

Leu Arg Asn Lys Asn Gly Cys His Val Glu Leu Leu Phe Leu Arg Tyr
50 55 60

Ile Ser Asp Trp Asp Leu Asp Pro Gly Arg Cys Tyr Arg Val Thr Trp
65 70 75 80

Phe Thr Ser Trp Ser Pro Cys Tyr Asp Cys Ala Arg His Val Ala Asp
85 90 95

Phe Leu Arg Gly Asn Pro Asn Leu Ser Leu Arg Ile Phe Thr Ala Arg
100 105 110

Leu Tyr Phe Cys Glu Asp Arg Lys Ala Glu Pro Glu Gly Leu Arg Arg
115 120 125

Leu His
130

<210> 27 
<211> 138 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 

<400> 27 

Met Lys Pro His Phe Arg Asn Thr Val Glu Arg Met Tyr Arg Asp Thr
1 5 10 15

Phe Ser Tyr Asn Phe Tyr Asn Arg Pro Ile Leu Ser Arg Arg Asn Thr
20 25 30

Val Trp Leu Cys Tyr Glu Val Lys Tyr Lys Gly Pro Ser Arg Pro Pro
35 40 45

Leu Asp Ala Lys Ile Phe Arg Gly Gln Val Tyr Ser Glu Leu Lys His
50 55 60

Pro Glu Met Arg Phe Glu His Trp Phe Ser Lys Trp Arg Lys Leu His
65 70 75 80

Arg Asp Gln Glu Tyr Glu Val Thr Trp Tyr Ile Ser Trp Ser Pro Cys
85 90 95

Thr Lys Cys Thr Arg Asp Met Ala Thr Phe Leu Ala Glu Asp Pro Lys
100 105 110

Val Thr Leu Thr Ile Phe Val Ala Arg Leu Tyr Tyr Phe Trp Asp Pro
115 120 125

Asp Tyr Gln Glu Ala Leu Arg Ser Leu Cys
130 135

<210> 28 
<211> 121 
<212> PRT 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 



<400> 28 
Glu Pro Trp Val Arg Gly Arg His Glu Thr Tyr Leu Cys Tyr Glu Val
1 5 10 15

Glu Arg Met His Asn Asp Thr Trp Val Leu Leu Asn Gln Arg Arg Gly
20 25 30

Phe Leu Cys Asn Gln Ala Pro His Lys His Gly Phe Leu Glu Gly Arg
35 40 45

His Ala Glu Leu Cys Phe Leu Asp Val Ile Pro Phe Trp Lys Leu Asp
50 55 60

Leu Asp Gln Asp Tyr Arg Val Thr Cys Phe Thr Ser Trp Ser Pro Cys
65 70 75 80

Phe Ser Cys Ala Gln Glu Met Ala Lys Phe Ile Ser Lys Asn Lys His
85 90 95

Val Ser Leu Cys Ile Phe Thr Ala Arg Ile Tyr Asp Asp Gln Gly Arg
100 105 110

Cys Gln Glu Gly Leu Arg Thr Leu Ala
115 120



<210> 29 
<211> 17 
<212> PRT 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 



<400> 29 

Ile Lys Pro Leu Leu Met Asp Glu Gln Asp His Gly Tyr Ala Leu Thr
1 5 10 15

Gly 



<210> 30 
<211> 8 
<212> PRT 

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequences; note = 
synthetic construct 

<400> 30 

Asn Ser Gly Val Thr Ile Gln Ile

1 5 

<210> 31 
<211> 8 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 

<400> 31 

Arg Ala Gly Val Gln Ile Ala Ile
1 5 

<210> 32 
<211> 11 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 

<400> 32 

Lys Arg Asp Gly Pro Arg Ala Thr Met Lys Ile
1 5 10

<210> 33 
<211> 8 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 

<400> 33 

Glu Ala Gly Ala Lys Ile Ser Ile
1 5 

<210> 34 
<211> 142 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 



<400> 34 

Met Lys Val Gly Gly Ile Glu Asp Arg Gln Leu Glu Ala Leu Lys Arg
1 5 10 15

Ala Ala Leu Lys Ala Cys Glu Leu Ser Tyr Ser Pro Tyr Ser His Phe
20 25 30

Arg Val Gly Cys Ser Ile Leu Thr Asn Asn Asp Val Ile Phe Thr Gly
35 40 45

Ala Asn Val Glu Asn Ala Ser Tyr Ser Asn Cys Ile Cys Ala Glu Arg
50 55 60

Ser Ala Met Ile Gln Val Leu Met Ala Gly His Arg Ser Gly Trp Lys
65 70 75 80

Cys Met Val Ile Cys Gly Asp Ser Glu Asp Gln Cys Val Ser Pro Cys
85 90 95

Gly Val Cys Arg Gln Phe Ile Asn Glu Phe Val Val Lys Asp Phe Pro
100 105 110

Ile Val Met Leu Asn Ser Thr Gly Ser Arg Ser Lys Val Met Thr Met
115 120 125

Gly Glu Leu Leu Pro Met Ala Phe Gly Pro Ser His Leu Asn
130 135 140







<210> 35 
<211> 131 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 



<400> 35 




























Met Asn Arg Gln Glu Leu Ile Thr Glu Ala Leu Lys Ala Arg Asp Met
1 5 10 15

Ala Tyr Ala Pro Tyr Ser Lys Phe Gln Val Gly Ala Ala Leu Leu Thr
20 25 30

Lys Asp Gly Lys Val Tyr Arg Gly Cys Asn Ile Glu Asn Ala Ala Tyr
35 40 45

Ser Met Cys Asn Cys Ala Glu Arg Thr Ala Leu Phe Lys Ala Val Ser
50 55 60

Glu Gly Asp Thr Glu Phe Gln Met Leu Ala Val Ala Ala Asp Thr Pro
65 70 75 80

Gly Pro Val Ser Pro Cys Gly Ala Cys Arg Gln Val Ile Ser Glu Leu
85 90 95

Cys Thr Lys Asp Val Ile Val Val Leu Thr Asn Leu Gln Gly Gln Ile
100 105 110

Lys Glu Met Thr Val Glu Glu Leu Leu Pro Gly Ala Phe Ser Ser Glu
115 120 125

Asp Leu His
130





























<210> 36 
<211> 106 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 

<400> 36 

Asp Ala Leu Ser Gln Ala Ala Ile Ala Ala Ala Asn Arg Ser His Met
1 5 10 15

Pro Tyr Ser Lys Ser Pro Ser Gly Val Ala Leu Glu Cys Lys Asp Gly
20 25 30

Arg Ile Phe Ser Gly Ser Tyr Ala Glu Asn Ala Ala Phe Asn Pro Thr
35 40 45

Leu Pro Pro Leu Gln Gly Ala Leu Ile Leu Leu Asn Leu Lys Gly Tyr
50 55 60

Asp Tyr Pro Asp Ile Gln Arg Ala Val Leu Ala Glu Lys Ala Asp Ala
65 70 75 80

Pro Leu Ile Gln Trp Asp Ala Thr Ser Ala Thr Leu Lys Ala Leu Gly
85 90 95

Cys His Ser Ile Asp Arg Val Leu Leu Ala
100 105



<210> 37 
<211> 135 
<212> PRT 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 

<400> 37 



Met Thr Ser Glu Lys Gly Pro Ser Thr Gly Asp Pro Thr Leu Arg Arg
1 5 10 15

Arg Ile Glu Pro Trp Glu Phe Asp Val Phe Tyr Asp Pro Arg Glu Leu
20 25 30

Arg Lys Glu Ala Cys Leu Leu Tyr Glu Ile Lys Trp Gly Met Ser Arg
35 40 45

Lys Ile Trp Arg Ser Ser Gly Lys Asn Thr Thr Asn His Val Glu Val
50 55 60

Asn Phe Ile Lys Lys Phe Thr Ser Glu Arg Asp Phe His Pro Ser Ile
65 70 75 80

Ser Cys Ser Ile Thr Trp Phe Leu Ser Trp Ser Pro Cys Trp Glu Cys
85 90 95

Ser Gln Ala Ile Arg Ser Phe Leu Ser Arg His Pro Gly Val Ile Leu
100 105 110

Val Ile Tyr Val Ala Arg Leu Phe Trp His Asn Asp Gln Gln Asn Arg
115 120 125

Gln Gly Leu Arg Asp Leu Val
130 135



<210> 38 

<211> 92 

<212> PRT 

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 



<400> 38 

Met Arg Ala Ser Glu Tyr Tyr His Cys Trp Arg Asn Phe Val Asn Tyr
1 5 10 15

Pro Pro Gly Asp Glu Ala His Trp Pro Gln Tyr Pro Pro Leu Trp Met
20 25 30

Met Leu Tyr Ala Leu Glu Leu His Cys Ile Ile Leu Ser Leu Pro Pro
35 40 45

Cys Leu Lys Ile Ser Arg Arg Trp Gln Asn His Leu Thr Phe Phe Glu
50 55 60

Leu His Leu Gln Asn Cys His Tyr Gln Thr Ile Pro Pro His Ile Leu
65 70 75 80

Leu Ala Thr Leu Ile His Pro Ser Val Ala Trp Arg
85 90



<210> 39 

<211> 60 

<212> PRT 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 



<400> 39 

Met Thr Phe Lys Asp Tyr Phe Tyr Cys Trp Asn Thr Phe Val Glu Asn
1 5 10 15

His Glu Arg Thr Phe Lys Ala Trp Glu Gly Leu His Glu Asn Ser Val
20 25 30

Arg Leu Thr Arg Gln Leu Arg Arg Ile Leu Leu Pro Leu Tyr Glu Val
35 40 45

Asp Asp Leu Arg Asp Ala Phe Arg Thr Leu Gly Leu
50 55 60



<210> 40 

<211> 57 

<212> PRT 

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 



<400> 40 

Met Asn Tyr Asp Glu Phe Gln His Cys Trp Ser Lys Phe Val Tyr Ser
1 5 10 15

Gln Arg Glu Leu Phe Glu Pro Trp Asn Asn Leu Pro Lys Tyr Tyr Ile
20 25 30

Leu Leu His Ile Met Leu Gly Glu Ile Leu Arg His Ser Met Asp Pro
35 40 45

Pro Thr Phe Thr Phe Asn Phe Asn Asn
50 55



<210> 41 

<211> 48 

<212> PRT 

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 



<400> 41 

Met Thr Tyr Ser Glu Phe Lys His Cys Trp Asp Thr Phe Val Asp His
1 5 10 15

Gln Gly Cys Pro Phe Gln Pro Trp Asp Gly Leu Asp Glu His Ser Gln
20 25 30

Asp Leu Ser Gly Arg Leu Arg Ala Ile Leu Gln Asn Gln Gly Glu Asn
35 40 45



<210> 42 

<211> 40 

<212> PRT 

<213> Artificial Sequence

<220>

<223> Description of Artificial Sequences; note =
synthetic construct

<221> VARIANT
<222> (12)...(27)

<223> Xaa can be any amino acid

<400> 42

Thr Asn His Val Glu Val Asn Phe Ile Lys Lys Xaa Xaa Xaa Xaa Xaa
1 5 10 15

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Trp Phe Leu Ser Trp
20 25 30

Ser Pro Cys Trp Glu Cys Ser Gln
35 40



<210> 43 

<211> 39 

<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 



<221> VARIANT
<222> (12)...(26)

<223> Xaa can be any amino acid

<400> 43

Gly Cys His Val Glu Leu Leu Phe Leu Arg Tyr Xaa Xaa Xaa Xaa Xaa
1 5 10 15

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Trp Phe Thr Ser Trp Ser
20 25 30

Pro Cys Tyr Asp Cys Ala Arg
35



<210> 44 

<211> 40 

<212> PRT 

<213> Artificial Sequence

<220>

<223> Description of Artificial Sequences; note =
synthetic construct

<221> VARIANT
<222> (12)...(27)

<223> Xaa can be any amino acid

<400> 44

Lys Tyr His Pro Glu Met Arg Phe Phe His Trp Xaa Xaa Xaa Xaa Xaa
1 5 10 15

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Trp Tyr Ile Ser Trp
20 25 30

Ser Pro Cys Thr Lys Cys Thr Arg
35 40



<210> 45
<211> 39
<212> PRT

<213> Artificial Sequence
<220>

<223> Description of Artificial Sequences; note =
synthetic construct

<221> VARIANT
<222> (12)...(26)

<223> Xaa can be any amino acid
<400> 45

Gly Arg His Ala Glu Leu Cys Phe Leu Asp Val Xaa Xaa Xaa Xaa Xaa
1 5 10 15

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Cys Phe Thr Ser Trp Ser
20 25 30

Pro Cys Phe Ser Cys Ala Gln
35

<210> 46 
<211> 35 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 

<221> VARIANT

<222> (12)...(22)

<223> Xaa can be any amino acid

<400> 46

Thr Val His Ala Glu Gln Ser Ala Ile Ser His Xaa Xaa Xaa Xaa Xaa
1 5 10 15

Xaa Xaa Xaa Xaa Xaa Xaa Ile Thr Val Asn Tyr Thr Pro Cys Gly His
20 25 30

Cys Arg Gln
35

<210> 47 
<211> 43 

<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 

<221> VARIANT
<222> (12)...(30)

<223> Xaa can be any amino acid
<400> 47

Cys Ile Cys Ala Glu Arg Ser Ala Met Ile Gln Xaa Xaa Xaa Xaa Xaa
1 5 10 15

Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Xaa Glu Asp
20 25 30

Gln Cys Val Ser Pro Cys Gly Val Cys Arg Gln
35 40

<210> 48
<211> 11 
<212> PRT 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 

<400> 48 

Thr Asn His Val Glu Val Asn Phe Ile Lys Lys
1 5 10



<210> 49 
<211> 13 
<212> PRT 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 

<400> 49 

Trp Phe Leu Ser Trp Ser Pro Cys Trp Glu Cys Ser Gln
1 5 10



<210> 50 
<211> 11 
<212> PRT 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 

<400> 50 

Gly Cys His Val Glu Leu Leu Phe Leu Arg Tyr 
1 5 10 



<210> 51 
<211> 13 
<212> PRT 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 

<400> 51 

Trp Phe Thr Ser Trp Ser Pro Cys Tyr Asp Cys Ala Arg 
1 5 10



<210> 52 

<211> 11 

<212> PRT 

<213> Artificial Sequence

<220>

<223> Description of Artificial Sequences; note =
synthetic construct

<400> 52

Ala Ala His Ala Glu Glu Ala Phe Phe Asn Thr
1 5 10



<210> 53 

<211> 13 

<212> PRT 

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 

<400> 53 

Trp Tyr Val Ser Ser Ser Pro Cys Ala Ala Cys Ala Asp 
1 5 10 



<210> 54 

<211> 11 

<212> PRT 

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 

<400> 54 

His Cys His Ala Glu Arg Cys Phe Leu Ser Trp 
1 5 10 

<210> 55 
<211> 13 
<212> PRT 

<213> Artificial Sequence 
<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 

<400> 55 

Trp Tyr Thr Ser Trp Ser Pro Cys Pro Asp Cys Ala Gly 
1 5 10 



<210> 56 
<211> 11 
<212> PRT 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 

<400> 56 

Gly Arg His Ala Glu Leu Cys Phe Leu Asp Val 
1 5 10



<210> 57
<211> 13
<212> PRT
<213> Artificial Sequence

<220>

<223> Description of Artificial Sequences; note =
synthetic construct

<400> 57

Cys Phe Thr Ser Trp Ser Pro Cys Phe Ser Cys Ala Gln
1 5 10



<210> 58 

<211> 11 

<212> PRT 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 

<400> 58 

Lys Tyr His Pro Glu Met Arg Phe Phe His Trp 
1 5 10



<210> 59 

<211> 13 

<212> PRT 

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 

<400> 59 

Trp Tyr Ile Ser Trp Ser Pro Cys Thr Lys Cys Thr Arg
1 5 10



<210> 60 

<211> 11 

<212> PRT 

<213> Artificial Sequence 



<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 

<400> 60 

Thr Val His Ala Glu Gln Ser Ala Ile Ser His
1 5 10



<210> 61 

<211> 13 

<212> PRT 

<213> Artificial Sequence

<220>

<223> Description of Artificial Sequences; note =
synthetic construct

<400> 61

Ile Thr Val Asn Tyr Thr Pro Cys Gly His Cys Arg Gln
1 5 10

<210> 62

<211> 11 

<212> PRT 

<213> Artificial Sequence

<220>

<223> Description of Artificial Sequences; note =
synthetic construct

<400> 62

Val Cys His Ala Glu Leu Asn Ala Ile Met Asn
1 5 10



<210> 63 

<211> 13 

<212> PRT 

<213> Artificial Sequence



<220> 

<223> Description of Artificial Sequences; note = 
synthetic construct 



<400> 63 

Met Tyr Val Ala Leu Phe Pro Cys Asn Glu Cys Ala Lys 
1 5 10 



